Threads for travis-bradbury

  1. 13

    Deno is an impressive project. But importing URLs rather than some abstraction (package names, “@vendorname/packagename”, reverse DNS notation, anything) is a no-go for me. I 100% do not want a strict association between the logical identifier of a package and a concrete piece of internet infrastructure with a DNS resolver and an HTTP server. No thank you. I hate that this seems to be a trend now, with both Deno and Go doing it.

    1. 8

      Package names, “@vendorname/packagename”, and reverse DNS notation, in systems like Maven or npm, are just abstractions over DNS resolvers and HTTP servers, but with extra roundtrips and complexity. The idea is to get rid of all those abstractions and provide a simple convention: whatever the URL is, it should never change its contents, so the toolchain can cache it.

      Any HTTP server with static content can act as an origin for getting your dependencies. It could be your own nginx instance, any other host, or all of those options combined.

      1. 21

        Package identifiers are useful abstractions. With the abstraction in place, the package can be provided by a system package manager, or the author can change their web host and no source code (only the name -> URL mapping) needs to change. As an author, I don’t want to promise that a piece of software will always, forevermore, be hosted on some obscure git host, or that I will always keep a particular web server alive at a particular domain with a particular directory structure. I want the freedom to move to a different code hosting solution in the future, but if every user has the URL in every one of their source files, I can’t do that. As a result, nobody wants to take the risk of using anything other than GitHub as their code host.

        With a system which uses reverse DNS notation, I can start using a library com.randomcorp.SomePackage, then later, when the vendor stops providing the package (under that name or at all) for some reason, the code will keep working as long as I have the packages with identifier com.randomcorp.SomePackage stored somewhere. With a system which uses URLs, my software will fail to build as soon as randomcorp goes out of business, changes anything about their infrastructure which affects paths, stops supporting the library, or anything else which happens to affect the physical infrastructure my code has a dependency on.

        The abstraction does add “complexity” (all abstractions do), but it’s an extremely useful abstraction which we should absolutely not abandon. Source code shouldn’t unnecessarily contain build-time dependencies on random pieces of physical Internet infrastructure.

        That’s my view of things anyways.

        1. 8

          As an author I don’t want to promise that a piece of software will always, forevermore, be hosted on some obscure git host, or to promise that I will always keep a particular web server alive at a particular domain with a particular directory structure. I want the freedom to move to a different code hosting solution in the future.

          The same applies to Maven and npm: repositories are coded into the project (or the default repository is defined by the package manager itself). If a host dies and you need to use a new one, you’ll need to change something.

          What happens if the npm registry stops responding? Everyone will have to change their projects to point at the new repository to get their packages.

          but if every user has the URL in every one of their source files I can’t do that. As a result, nobody wants to take the risk to use anything other than GitHub as their code host.

          In Deno you can use an import map (and I encourage everyone to do so), so all the hosts are in a single place: just one file to look at when a host dies, just like npm’s .npmrc.
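          As a sketch, an import map that pins a bare specifier to a host could look something like this (the pinned version is hypothetical):

```json
{
  "imports": {
    "fmt/": ""
  }
}
```

          With that in place, code imports "fmt/colors.ts" and only the import map has to change when the host moves.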

          There are lockfiles, too.

          And another note: it’s somewhat typical for companies to have an internal repository that works as a proxy for npm/maven/etc. and caches all the packages in case some random host dies; that way the company release pipeline isn’t affected. Depending on the package manager and ecosystem, you’ll need very specific software to implement this (Verdaccio for npm, for example). But with Deno, literally any off-the-shelf HTTP caching proxy will work, something way more common for systems people.
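          For illustration, a minimal nginx caching-proxy sketch (cache path, zone name, and port are all hypothetical; deno.land is used as an example upstream):

```nginx
# long-lived cache: URLs that never change their contents cache very well
proxy_cache_path /var/cache/nginx/deno keys_zone=deno_deps:10m max_size=10g inactive=365d;

server {
    listen 8080;

    location / {
        proxy_cache deno_deps;
        proxy_cache_valid 200 365d;
        proxy_set_header Host;
        proxy_ssl_server_name on;
        proxy_pass;
    }
}
```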

          Source code shouldn’t unnecessarily contain build-time dependencies on random pieces of physical Internet infrastructure.

          That’s right, but there are only two ways to make builds from source code without needing random pieces of physical internet infrastructure, and these apply for all package management solutions:

          • You have no dependencies at all
          • All your dependencies are included in the repository

          All other solutions are just variations of dependency caching.

        2. 5

          Although this design is simpler, it has a security vulnerability which seems unsolvable.

          The scenario:

          1. A domain expires that was hosting a popular package
          2. A malicious actor buys the domain and hosts a malicious version of the package on it
          3. People who have never downloaded the package before, and therefore can’t possibly have a hash/checksum of it, read blog posts/tutorials/StackOverflow answers telling them to install the popular package; they do, and get compromised.

          It’s possible to prevent this with an extra layer (e.g. an index which stores hashes/checksums), but I can’t see how the “URL only” approach could even theoretically prevent this.

          1. 2

            I think the weak link there is people blindly copy-pasting code from StackOverflow. That opens the door to a myriad of security issues, not only for Deno’s approach.

            There are plenty of packages on npm with names very similar to legit, popular packages, differing by maybe just a letter, an underscore, or a number. That’s enough for many people to install the wrong, malicious package and get compromised.

            The same applies to domain names. Maybe someone buys a look-alike domain and just writes it as DEN0.LAND in a forum online, because domains are case-insensitive anyway and the zero can hide better.

            Someone could copy some random Maven host from StackOverflow and get the backdoored version of all their existing packages in their next gradle build.

            Sure, in that sense, Deno is more vulnerable because of the decentralisation of package-hosting domains. It’s easier for everyone to know which domain is the “good” one when there are only one or two of them. If any host could be a good host, any host could mislead and become a bad one, too.

            For npm, we depend entirely on the fact that the domain won’t get taken over by a malicious actor without anyone noticing. I think people will end up organically doing the same and getting their dependencies mostly from a few well-known domains, but I think it’s important to have the option to not follow this rule strictly and to have some freedom.


            Apart from the “depend on very well-known and trusted centralised services” strategy, something more could be done to address the issue. Maybe there’s something about that in the Deno 2 roadmap when it gets published. But fighting against StackOverflow blind copy-pastes is hard.

            1. 1

              What about people checking out an old codebase on a brand new computer where that package had never been installed before?

              I dunno, this just feels wrong in so many ways, and there are lots of subtle issues with it. Why not stick to something that’s been proven to work, for many language ecosystems?

              1. 1

                What about people checking out an old codebase on a brand new computer where that package had never been installed before?

                That’s easily solved with a lockfile, just like npm does.
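                For reference, Deno’s lockfile is essentially a URL → hash map; a sketch of its shape (details vary by Deno version, and the version pin and hash here are placeholders):

```json
{
  "": "<sha256 of the fetched contents>"
}
```

                In Deno 1.x, `deno cache --lock=lock.json --lock-write deps.ts` writes it, and later runs verify against it with `deno cache --lock=lock.json deps.ts` (flags have changed across versions).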

                Why not stick to something that’s been proven to work, for many language ecosystems?

                Well, the centralized model has many issues. Currently, every piece of software that runs on Node, to be distributed, has to be hosted by a private company owned by Microsoft. That’s a single point of failure, and a whole open source ecosystem relying on a private company and a private implementation of the registry.

                Also, do you remember all the issues with youtube_dl on GitHub? Imagine something similar in npm.

                Related to the topic:

                1. 3

                  Good points, those. I hadn’t considered that!

                  The single point of failure is not necessarily inherent to the centralized model though. In CHICKEN, we support multiple locations for downloading eggs. In Emacs, the package repository also allows for multiple sources. And of course APT, yum and Nix allow for multiple package sources as well. If a source goes rogue, all you have to do is remove it from your sources list and switch to a trustworthy source which mirrored the packages you’re using.

            2. 2

              Seems like you might want to adopt a “good practice” of only using URL imports from somehow-trusted sources, e.g. unpkg or whatever.

              Could have a lint rule for this with sensible, community-selected defaults as well.

              1. 2

                If the code where the URL appears also includes a hash of the content, then, assuming the hash isn’t broken, it avoids this problem.



                You get the URL and the hash, problem solved. You either get the code as proved by the hash or you don’t get the code.
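                The check itself is trivial; a shell sketch of the idea, using sha256sum and the well-known hash of empty input as a stand-in for a real module:

```shell
# the expected hash would be written next to the URL in the source;
# here it's the sha256 of empty input, as a stand-in
expected="e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
# a real tool would hash the fetched response body instead of empty input
actual="$(printf '' | sha256sum | cut -d' ' -f1)"
if [ "$actual" = "$expected" ]; then
  echo "hash ok: use the code"
else
  echo "hash mismatch: refuse the code" >&2
  exit 1
fi
```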

                The downside is, you then lose auto-upgrades to some new latest version, but that is usually a bad idea anyway.

                Regardless, I’m in favour of one level of indirection, so in X years when GitHub goes out of business (because we all moved on to new source control tech), people can still run code without having to hack up DNS and websites and everything else just to deliver some code to the runtime.

                1. 1

                  This is a cool idea, although I’ve never heard of a package management design that does this!

                  1. 1

                    Agreed, I don’t know of anyone that does this either. In HTTPS land, we have SRI, which is in the same ballpark, though I imagine the # of sites that use SRI can be counted on 1 hand :(

                2. 1

                  It’s sort of unlikely to happen often in Go because most users use a well-known code host as their URL. It could affect people using their own custom domain, though. In that case, it would only affect new users. Existing users would continue to get the hash-locked versions they saved before from the Go module server, but new users might be affected. The Go module server does, I think, have some security checks built into it, so if someone noticed the squatting, they could protect new users by blacklisting the URL in the Go module server. (The docs say to email the Go team if it comes up.)

                  So, it could happen, but people lose their usernames/passwords on NPM and PyPI too, so it doesn’t strike me as a qualitatively more dangerous situation.

                3. 2

                  Whatever the URL is, it should never change its contents, so the toolchain can cache it.

                  Does Deno do anything to help with that or enforce it?

                  In HTML you can apparently use subresource integrity:


                  <script src=""
                          integrity="sha384-<base64 hash of the expected file>"
                          crossorigin="anonymous"></script>

                  It would make sense for Deno to have something like that, if it doesn’t already.

                  1. 2

                    Deno does have something like that. Here are the docs for it.

                    1. 2

                      OK nice, I think that mitigates most of the downsides … including the problem where someone takes over the domain. If they publish the same file it’s OK :)

                4. 2

                  While I agree with your concern in theory, in the Go community this problem is greatly mitigated by two factors:

                  • Most personal projects tend to just use their GitHub/GitLab/… URLs.
                  • The Go module proxy’s cache is immutable, meaning that published versions cannot be changed retrospectively.

                  These two factors combined achieve the same level of security as a centralized registry. It is possible that Deno’s community will evolve in the same direction.

                  1. 1

                    change your hosts file

                  1. 2

                    Interesting. What do you mean by “this is a better compromise for scripts”? I’m not sure I see where this would be much different in that context.

                    1. 2

                      I’m working on a deployment tool, and, for example, if you want to use git and clone a repo for the first time (for example from CI), you need to manually log in to the server and run an ssh command to update known_hosts.

                      With accept-new this workflow is automated and no manual setup is needed.
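                      Concretely, the option looks like this (the host and command are hypothetical):

```shell
# accept-new records the key on the first connection and fails loudly
# if the host's key ever changes afterwards.
# deploy.example.invalid is a made-up host; the || true just keeps this
# sketch from failing when run outside a real deployment
ssh -o StrictHostKeyChecking=accept-new git@deploy.example.invalid 'echo connected' || true
```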

                      1. 1

                        I imagine it’ll be better for scripts that issue multiple SSH commands. You can verify the remote end hasn’t changed host keys between the two (or more) invocations of SSH, whereas with “no” you just accept whatever the host key is, whether it changes or not.

                        You can’t tell if the host changes between script runs but you can be sure the host hasn’t changed during the current run.

                        1. 4

                          I solve this in CI by putting the host’s fingerprint in a variable and writing that to known_hosts. I would think the odds of a host key changing in between commands of a job would be tiny, and the damage could already be done.

                          It’s still “trust on first use”, but that first use is when I set up CI and set the variable, not at the start of every job.
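                          A sketch of that CI step (SSH_HOST_KEY is a hypothetical CI secret holding a full known_hosts line):

```shell
# seed known_hosts from a pinned value instead of trusting first use;
# the default below is a placeholder for when the secret isn't set
SSH_HOST_KEY="${SSH_HOST_KEY:-deploy.example.invalid ssh-ed25519 AAAA...pinned-key}"
mkdir -p "$HOME/.ssh"
printf '%s\n' "$SSH_HOST_KEY" >> "$HOME/.ssh/known_hosts"
chmod 700 "$HOME/.ssh"
chmod 600 "$HOME/.ssh/known_hosts"
```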

                          1. 3

                            I think this is the correct way to do it, I do this as well for CI jobs SSH-ing to longer-lived systems.

                            If the thing I’m SSHing into is ephemeral, I’ll make it upload its ssh host public keys to an object storage bucket when it boots via its cloud-init or “Userdata” script. That way the CI job can simply look up the appropriate host keys in the object storage bucket.

                            IMO any sort of system that creates and destroys servers regularly, like virtual machines or VPSes, should make it easy to query or grab the machine’s ssh public keys over something like HTTPS, like my object storage bucket solution.

                            I guess this is a sort of pet peeve of mine. I was always bugged by the way that Terraform’s remote-exec provisioner turns off host key checking by default, and doesn’t warn the user about that. I told them this is a security issue and they told me to buzz off. Ugh. I know it’s a bit pedantic, but I always want to make sure I have the correct host key before I connect!!! Similar to TLS, the entire security model of the connection can fall apart if the host key is not known to be authentic.

                          2. 2

                            Unless you’re clearing the known_hosts file (and if so, WTF), I don’t see why there would be a difference between consecutive connections within a script and consecutive connections between script runs.

                            1. 4

                              Jobs/tasks running under CI pipelines often don’t start with a populated known_hosts. Ephemeral containers too. Knowing you’re still talking to the same remote end (or someone with control of the same private key at least) is better than just accepting any remote candidate in that case.

                              Less “clearing known_hosts” file, more “starting without a known_hosts” file.

                        1. 3

                          There was that one particular feature of the language that impressed me back then, and that was the day I acknowledged the flexibility of the PHP language: you can create an instance of a class and call a static method using an array!

                          class Foo {
                              public static function run($message) {
                                  echo $message;
                              }
                          }
                          ["Foo", "run"]("BAR"); // prints "BAR"
                          1. 5

                            For anybody as bewildered as I was (and still am), this is not some bizarre artifact of PHP implicit type conversions or weird object system; the “Callable array” syntax was explicitly introduced as a way of dynamically invoking static methods.

                            1. 2

                              In 8.1, releasing next week, there’s finally first-class callable support, so strings and arrays aren’t necessary.
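                              For the curious, the 8.1 syntax looks like this (class and method names are made up for illustration):

```php
class Greeter {
    public static function run(string $message): string {
        return "ran: " . $message;
    }
}

// PHP 8.1 first-class callable syntax: no strings or arrays needed
$fn = Greeter::run(...);
echo $fn("BAR"), "\n";   // ran: BAR
```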

                              1. 1

                                create an instance of a class

                                Static methods don’t require instantiation. Are you sure instantiation is happening here?

                                Also, wouldn’t it be better to be able to write something like Foo::run("BAR")?

                                1. 1

                                  You’re correct, this is a static method invocation. But this convention does also work for instance methods if you supply an object in the first array item instead of the class name.

                              1. 2

                                The part about annoying users by rejecting passwords reminded me of when I did tech support for a company that implemented zxcvbn.

                                First, you can accept the 1/99,857,412 chance of a false positive and block the user from using the password. This means that for every ~100,000,000 users that sign up to your service, you might annoy one of them.

                                It’s not really true that you’d annoy that user, since they don’t know that you lied about the password being pwned (and neither do you; it likely was pwned). More likely, you’ll annoy lots of users signing up, because you’re rejecting most of their passwords.

                                I spoke to people who were brought to tears, frustrated that they couldn’t pick a password that satisfied the website that used zxcvbn. Many others were angry, including one, who after many failed attempts to please the password checker, told me that nobody else would have that password because it was the name of his cat.

                                I don’t really have a point here, except that you can’t add a restriction, even a good one like “the password isn’t in the Pwned Passwords database”, without annoying people.
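                                For context, the Pwned Passwords lookup itself is cheap and privacy-friendly via k-anonymity: only the first 5 hex chars of the SHA-1 leave your machine. A sketch (the function name is mine):

```python
import hashlib

def range_query_parts(password: str) -> tuple[str, str]:
    """Split the uppercase SHA-1 hex digest into (5-char prefix, 35-char suffix)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]

prefix, suffix = range_query_parts("password")
print(prefix)   # 5BAA6
# A real check would then GET<prefix>
# and search for `suffix` in the response (one "suffix:count" line per hash).
```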

                                1. 3

                                  In my experience, using zxcvbn as training wheels to steer users towards better passwords (i.e. by simply displaying the derived “password strength” when creating a new password) works pretty well. Using zxcvbn’s score as a hard restriction is a choice of the implementor, not the library authors.

                                  1. 2

                                    I tried to implement zxcvbn for a client once and they begged me to replace it with some “simple rules” like “the password must contain a digit”. I told them that professional ethics wouldn’t allow me to do that, but I could just have no requirements instead since that has equivalent safety.

                                  1. 2

                                    My first impression was that I don’t like this. I’ve seen plenty of the miscommunication that they’re warning about, but why not solve it with good, old-fashioned sentences? If you want to say that something is not essential, you can say it casually, like you would in speech. It doesn’t have to be in a machine-friendly format. You’re not talking to a machine.

                                    After some thought, I’m a bit intrigued. It seems like the selling point is that the label acts as a prompt for the reviewer. In the first example, the real solution, as shown, was making the feedback actionable, not adding a label. But if labels remind people that their suggestion actually has to suggest something, that would be good.

                                    Still, it feels like a technical solution to a non-technical problem. If people don’t recognize that what they’re saying isn’t actionable, I’m not convinced that syntax rules will help them. They might need someone to clearly communicate to them that there’s a problem.

                                    Labels feel like declarations of a fact, which may be fine for “suggestion” but not for something like “non-blocking” or “optional”. Those are debatable. You should say why we should deal with it, or why we don’t have to. That encourages thinking about and discussing the idea. The author can contribute to that. Maybe they know something about the problem the reviewer doesn’t. Maybe neither person can decide whether the change is essential, and someone else should be asked. If I’m suggesting something and the only thing I say about the importance of it is “it doesn’t matter if we do this or not”, why waste people’s time and attention with the suggestion in the first place?

                                    1. 1

                                      The article is so light on detail that I can’t tell for sure, but it sounds like they’re just saying binaries were hosted on Azure infrastructure, not that they were necessarily being executed on Microsoft infrastructure. So they might not be any more infected than Google would be if I put a malicious binary in my Google Drive.

                                      1. 9
                                        0 == 'foobar' // false

                                        I thought I’d never see the day this would happen. It doesn’t completely get rid of the type juggling; 42 == " 42" is still true, but at least 42 == "42foo" is false now.
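                                         Spelling out the PHP 8 semantics (per the saner string-to-number comparisons change):

```php
// number == string comparisons in PHP 8
var_dump(0 == 'foobar');   // bool(false)  (was true in PHP 7)
var_dump(42 == ' 42');     // bool(true)   (leading whitespace still allowed)
var_dump(42 == '42foo');   // bool(false)  (was true in PHP 7)
```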

                                        1. 4

                                          I wonder how many applications will break from this change alone.

                                          1. 3

                                            I don’t think it truly answers your question, but there was some work done to determine that.

                                            The RFC discussion refers to an analysis which found few problems, even in an older codebase.

                                            I haven’t looked at the mailing list discussions for the RFC, but if there was further research into that question, I’d expect to find it there.

                                            1. 4
                                        1. 14

                                          This is how I git, as a self-admitted near-idiot:

                                          • Never branch, always on m*ster.

                                          • Commit mainly from IntelliJ’s GUI.

                                          • Push either from IntelliJ or command line, can go either way.

                                          • On the server, git pull.

                                          • If there’s any trouble, mv project project.`date +%s` and re-clone.

                                          1. 8

                                            In my opinion people tend to pay too much attention to CLI commands and steps. As long as one understands what branches and commits are, it becomes immensely easier to handle git and potential problems.

                                            1. 1

                                              This is what I refer to as the “xkcd git workflow”:

                                              1. 1

                                                I feel like even people more used to git resort to the last bullet point every now and then, I know I have :P

                                                1. 3

                                                  is a fantastic resource for fixing mistakes, which helps demystify git. It’s a ‘choose your own adventure’ guide where you decide what state you want to end up at and a few other facts, and it tells you what to do.

                                                  1. 1

                                                    First step

                                                    Strongly consider taking a backup of your current working directory and .git to avoid any possibility of losing data […]

                                                    Hehe, off to a good start. This basically sums it up, though: that copy of a directory is a safety net in case any other steps go wrong.

                                                  2. 1

                                                    I admit I used it a lot at my university, because they didn’t teach us how git works and I didn’t take the time to learn it on my own.

                                                    Now, when my local branch is a mess, if I have no local changes to keep and if I know for sure that my branch is in a clean state on the remote repository, I just do:

                                                    git reset --hard origin/my-branch

                                                    With the years passing, it appears to me that you don’t end up with these “fak I have to reclone my repo” or “fak I don’t know how to fix this conflict” problems if you are meticulous with what you commit and where.

                                                    It takes a bit more time upfront to make commits that you are proud of, but in the end it makes it very easy to understand what you did some days/weeks/months ago (and it will save your ass when you have to find when a regression/bug happened).

                                                    TL;DR: git flow + self-explanatory commits = <3

                                                    1. 1

                                                      Oh man! I did this two weeks ago. I had folders numbered 1-n, and in each one I had the same project cloned but in a messed-up state. Granted, it was a new technology stack for me, Node.js to be precise.

                                                  1. 4

                                            “Unit Testing is Overrated” discusses the same topic but argues for more coverage by functional tests and less by unit tests. Both articles are really about the value that different types of tests have, and balancing the amount of each to be more useful.

                                                    1. 1

                                                      It’s a question I’ve always had: are there any general rules/guidelines people follow as to what types of applications should have certain types of tests? Or is it similar to programming paradigms like OO versus functional, where it depends on the programmers involved and what their preferences are?

                                                    1. 4

                                                      I am sorry, but if you still do not know about multi-byte characters in 2020, you should really not be writing software. The 1990s, in which you could assume 1 byte == 1 char, have long passed.

                                                      1. 28


                                                        Nobody was born knowing about multi-byte characters. There’s always new people just learning about it, and probably lots of programmers that never got the memo.

                                                        1. 5

                                                          The famous Joel article on Unicode is almost old enough to vote in most countries (17 years). There is really no excuse to be oblivious to this.

                                                          This is esp. problematic if you read the last paragraph where the author gives encryption/decryption as an example. If somebody really is messing with low level crypto apis, they have to know this. There is no excuse. Really.

                                                          1. 10

                                                            Junior Programmers aren’t born having read a Joel Spolsky blog post. There are, forever, people just being exposed to the idea of multibyte characters for the first time. Particularly if, like a lot of juniors are, they’re steered towards something like the K&R C book as a “good” learning resource.

                                                            Whether or not this blog post in particular is a good introduction to the topic is kind of a side point. What was being pointed out to you was that the expectation that everyone in 2020 has already learned this topic at some point in the past is beyond silly. There are always new learners.

                                                            1. 4

                                                              You are aware that there are new programmers born every day, right?

                                                            2. 4

                                                              Right, but this author is purporting to be able to guide others through this stuff. If they haven’t worked with it enough to see the low-hanging edge cases, they should qualify their article with “I just started working on this stuff, I’m not an expert and you shouldn’t take this as a definitive guide.” That’s a standard I apply to myself as well.

                                                              1. 2

                                                                We should perhaps not expect newcomers to know about encoding issues, but we should expect the tools they (and the rest of us) use to handle it with a minimum of bad surprises.

                                                              2. 8

                                                                That’s a little harsh, everyone has to learn sometime. I didn’t learn about character encoding on my first day of writing code, it took getting bitten in the ass by multibyte encoding a few times before I got the hang of it.

                                                                 Here is another good intro to multibyte encoding for anyone who wants to learn more.
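                                                                 A tiny illustration of why the 1-byte-per-char assumption fails:

```python
# character count and byte count diverge once you leave ASCII
city = "Zürich"
print(len(city))                  # 6 code points
print(len(city.encode("utf-8")))  # 7 bytes; 'ü' encodes to two bytes in UTF-8
```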

                                                                1. 3

                                                                  I didn’t learn about character encoding on my first day of writing code, it took getting bitten in the ass by multibyte encoding a few times before I got the hang of it.

                                                                  Right, but you’re not the one writing and publishing an article that you intend for people to use as a reference for this type of stuff. People are responsible for what they publish, and I hold this author responsible to supply their writing with the caveat that their advice is incomplete, out-of-date, or hobby-level—based, I presume, on limited reading & experience with this stuff in the field.

                                                                2. 8

                                                                  I’m sure that if I knew you well enough, I could find three things you didn’t know that respected developers would say means “you should really not be writing software”.

                                                                  1. 3

                                                                    Yes it’s 2020, but also, yes, people still get this wrong. 90% of packages addressed to me mangle my city (Zürich) visibly on the delivery slip, so do many local(!) food delivery services.

                                                                    Every time I make a payment with Apple Pay, the local(!) App messes up my city name in a notification (the wallet app gets it right).

                                                                    Every week I’m handling support issues with vendors delivering data to our platform with encoding issues.

                                                                    Every week somebody in my team comes to me with questions about encoding issues (even though by now they should know better).

                                                                    This is a hard problem. This is also a surprising problem (after all „it’s just strings“).

                                                                    It’s good when people learn about this. It’s good when they communicate about this. The more people write about this, the more will get it right in the future.

                                                                    We are SO far removed from these issues being consistently solved all throughout

                                                                    1. 2

                                                                      I know all that. My first name has an accented character in it. I get broken emails all the time. That still does NOT make it okay. People that write software have to know some fundamental things and character encodings is one of them. I consider it as fundamental as understanding how floats work in a computer and that they are not precise and what problems that causes.

                                                                      The article being discussed is not good and factually wrong in a few places. It is also not written in a style that makes it sound like somebody is documenting their learnings. It is written as stating facts. The tone makes a big difference.

                                                                    2. 2

                                                                      There’s a difference between knowing there’s a difference, which I suspect is reasonably common knowledge, and knowing what the difference is.

                                                                      1. 2

                                                                        There are very few things that every software developer needs to know–fewer than most lists suggest, at least. Multi-byte encodings and Unicode are about as good a candidate as exists for inclusion in that list.

                                                                        However, people come to software through all different paths. There’s no credential or exam you have to pass. Some people just start scripting, or writing mathematical/statistical code, and wander into doing things. Many of them will be capable of doing useful and interesting things, but are missing this very important piece of knowledge.

                                                                        What does getting cranky in the comments do to improve that situation? Absolutely nothing.

                                                                        1. 3

                                                                          There’s no credential or exam you have to pass.

                                                                          I think that this is one of the problems with our industry. People with zero proof of knowledge are futzing around with things they do not understand. I am not talking about hobbyists here, but about people writing software that is being used to run critical infrastructure. There is no other technical profession where that is okay.

                                                                          1. 2

                                                                            I think our field is big and fast enough that dealing with things we don’t yet understand has just become part of the job description.

                                                                      1. 3

                                                                        This isn’t a zero day. It’s known by Apple, and presumably already addressed. It doesn’t really say so, except it refers to the exploit in the past tense.

                                                                        Apple also did an investigation of their logs and determined there was no misuse or account compromise due to this vulnerability.

                                                                        1. 7

                                                                          It was a zero-day when he took it to Apple, otherwise they would be suing him for $100,000,000 for publishing this article instead of having rewarded him with $100,000.

                                                                          1. 1

                                                                            No, not really. A “zero day” is a vulnerability that has not yet been patched by the vendor. There’s no vendor here. It’s not like Apple was waiting on Nginx to release a patch; Apple is both the author of the software and the affected party. When they were notified, they patched; as soon as the patch was put in place, all “users” of the software were instantly patched—because there was only one, which was Apple.

                                                                            The premise of a zero day is that you can find a vulnerability in a product and then exploit it in all the places it is used. That’s what makes them interesting.

                                                                            1. 0

                                                                              all the places it is used

                                                                              All the places that use Sign In With Apple. Just because only one centralized server deployment was vulnerable doesn’t mean it wasn’t exploitable all over the web

                                                                              This is a really stupid thing to argue over, and you’re wrong anyway

                                                                          2. 3

                                                                            There are two definitions of 0-day:

                                                                            • The vulnerability is known outside of the vendor before the patch is released.
                                                                            • The vulnerability is actively exploited before the patch is released.

                                                                            This definitely meets the first definition. The second is less useful: you can sometimes show that something is a zero-day according to this definition but you can rarely show that something isn’t.

                                                                            1. 2

                                                                              I assume that refers to the steps Apple took after they verified this bug was valid, however I agree - there is no indication per this article that this is a zero day

                                                                            1. 4

                                                                              I find that my single most useful skill as a programmer, excluding any and all social aspects of software being a community endeavor, is my ability to read, understand, and navigate code that is not mine. I put those three together because to trace a program, you need to be able to understand what file to jump to next. Also to know what to change in the program such that it can do what you mean it to, without looking like the Potato Jesus restoration project.

                                                                              1. 1

                                                                                For those to whom “Potato Jesus Restoration” is not obvious, it’s a reference to a botched restoration of a painting.

                                                                              1. 9

                                                                                Yes, this is also a good page about the issue, and related issues:


                                                                                As some might know, Oil is very much about proper parsing and string safety.

                                                                                Code and data are kept separate. For example:

                                                                                • Shell arithmetic expansion conflates data and code, but Oil statically parses arithmetic expressions instead.
                                                                                • Shellshock was also about the conflation of data and code, with the export -f misfeature that serialized a function to a string!

                                                                                However I don’t really have answer to this flag vs. arg issue (flag being code, and arg being data), other than “use -- everywhere”.

                                                                                Relatedly, in Oil, echo is no longer a special case because it supports -- (if you use bin/oil rather than bin/osh). So you can do echo -- $x, which is a safe version of echo $x.

                                                                                • In POSIX shell you have to use printf to echo an arbitrary string, although not many shell scripts follow that discipline!
                                                                                • You can’t use echo -- $x in POSIX shell because that would output 2 dashes also :)
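                                                                                To make the `printf` discipline above concrete, here is a tiny sketch (nothing Oil-specific; any POSIX shell):

                                                                                ```shell
                                                                                # A string that happens to look like an echo flag.
                                                                                x='-n'

                                                                                # echo "$x" is unreliable: bash's builtin echo swallows -n as a
                                                                                # flag and prints nothing, while some other shells print it literally.
                                                                                echo "$x"

                                                                                # printf with an explicit format string always treats $x as data:
                                                                                printf '%s\n' "$x"    # prints: -n
                                                                                ```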

                                                                                If anyone has a better idea than “use -- everywhere”, let me know :)

                                                                                I guess you write shell in two modes: the quick and dirty throwaway, and then Oil is supposed to allow you to upgrade that to a “real” program. So there could be a lint tool that warns about -- or just auto-refactors the script for you. Or maybe there is some option that breaks with shell more radically.

                                                                                1. 4

                                                                                  If you ever write a script that operates on untrusted files, always make sure the command will do exactly the thing you wanted it to do.

                                                                                  The problem is that when you ask yourself “does this do exactly what I want it to?”, you don’t have the imagination to come up with something like “filenames that look like flags will be interpreted as one.”
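                                                                                  As a concrete sketch of that failure mode (using a throwaway temp directory; the filenames are made up):

                                                                                  ```shell
                                                                                  # A file whose name looks like flags to rm:
                                                                                  cd "$(mktemp -d)"
                                                                                  touch -- '-rf' 'innocent.txt'

                                                                                  # A naive cleanup does: rm *
                                                                                  # The glob expands to: rm -rf innocent.txt
                                                                                  # so "-rf" is parsed as options instead of being deleted.

                                                                                  # With "--" terminating option parsing, every glob result is data:
                                                                                  rm -- *
                                                                                  ```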

                                                                                  Someone who would make a safer, saner shell to write programs with would be a hero.

                                                                                  1. 3

                                                                                    I started some work on a safer, more explicit shell, realizing that the fundamental offering of any shell is the ability to just type the name of a function and arguments unadorned. I called this “quotation”.

                                                                                    However, after thinking about it more, I realized that no solution will solve the dissonance inherent to any language like that. You will always be tripping up and compromising over where something is or is not quoted. All templating languages have this problem.

                                                                                    Instead, I’m currently of the opinion that an approach more like PowerShell, in which you call functions rather than writing the names of programs and arguments as text, is the right way forward. This removes the problem of quotation. The downside to this approach is that it requires work to produce the APIs. It’s fine if you have a large standard library, as PowerShell does, but being able to pull a binary off the shelf, e.g. one written in Python or C, should still be natural.

                                                                                    The missing part, therefore, in my opinion, is that programs (in any system, be it Linux, OS X, Windows, BSD) ought to be accompanied by a schema (could be in JSON, doesn’t matter), let’s say git and git.schema, which can be interpreted offline or “cold” – without running the program (very important) – in order to know (1) the arguments/flags the program accepts (as commands or switches), (2) the types/formats of those inputs, (3) possibly the side-effects of those commands.

                                                                                    This allows a shell or IDE to provide a very strong completion and type-checking story, and to provide it out of the box. A projectional editor would be satisfying here, too (even something as simple as CLIM’s listener).

                                                                                    When downloading a random binary online, you could additionally download the schema for it. The schema file itself can contain a SHA256 of the binary that it’s talking about, to avoid accidental misuse. Currently, if you want completion for an exe, you have to generate some bash. So it’s clear that the need is already there; it’s just implemented poorly.

                                                                                    The upside to this approach is that it’s additive; no one has to change their existing software. Additionally, it’s easy to produce for old software: you can make a parser for --help or man pages to generate a “best guess” schema for a program. The reason you wouldn’t put this into a shell by default is that some programs don’t accept --help and/or they run side effects and delete things. Existing opt-parser libraries can generate such schemas, just like some of them can generate bash at the moment.

                                                                                    Another upside is that it can simply be a standard/RFC published freely, just like JSON.
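                                                                                    To make the idea concrete, a git.schema might look something like the following sketch (every field name here is invented for illustration; no such format exists today):

                                                                                    ```json
                                                                                    {
                                                                                      "program": "git",
                                                                                      "sha256": "<hash of the binary this schema describes>",
                                                                                      "subcommands": {
                                                                                        "clone": {
                                                                                          "args": [ { "name": "repository", "type": "url", "required": true } ],
                                                                                          "flags": [ { "name": "--depth", "type": "integer" } ],
                                                                                          "side_effects": [ "writes to filesystem", "network access" ]
                                                                                        }
                                                                                      }
                                                                                    }
                                                                                    ```

                                                                                    A shell or IDE could read such a file cold – without executing git – to drive completion and type-checking.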

                                                                                    I haven’t moved forward on this schema idea yet, because it’s hard to research existing solutions (hard to google for). It would be an FFI standard but instead of calling C functions you’re calling processes.

                                                                                    1. 2

                                                                                      Yeah I agree with your diagnosis of the issue. You need a schema and a parser to go along with every command.

                                                                                      I say a parser because in general every command has its own syntax, in addition to semantics (the types of the arguments). Some commands recognize fully recursive languages in their args like find, test, and expr. (Although of course there are common conventions.)

                                                                                      It relates pretty strongly to this idea about shell-agnostic autocompletion we’ve been kicking around. Every command needs a parser and a schema there too.


                                                                                      And yes it’s nice to have the property that it’s a pure addition. It’s basically like TypeScript or MyPy (which I’m using in Oil). They handle a lot of messiness to provide you with a smooth upgrade path.

                                                                                      If you’d like to pursue these ideas I think Oil will help because it has an accurate and flexible shell parser :) I break shell completion into two parts: (1) completing the shell language, then (2) completing the language of each individual tool. That’s the only way to do it correctly AFAICT. bash conflates the two problems which leads to a lot of upstream hacks.

                                                                                    2. 1

                                                                                      That’s really interesting, especially thanks for the dwheeler page.

                                                                                      I solve shell problems by using shellcheck all the time, it catches most mistakes one can make and it integrates nicely into existing editors.

                                                                                      Oil looks really interesting and is certainly better than the shell+shellcheck combo, but I don’t think I want to write everything in new syntax that is not universal and might mean I’ll have to rewrite it in bash later anyway.

                                                                                      1. 3

                                                                                        Well, Oil is very bash-compatible – it runs your existing shell/bash scripts. You can use it as another correctness tool on top of ShellCheck. It catches problems at runtime in addition to parse time.

                                                                                        Example: Oil’s Stricter Semantics Solve Real Problems

                                                                                        There are a bunch of other examples I should write about too.

                                                                                        If your shell scripts are used for years, and keep getting enhanced, chances are that you will run into the limitations of bash. And you won’t want to rewrite the whole thing in another language. That’s what Oil is for :) It’s fine to stick with bash now, but as you use shell more and more, you will run into a lot of limitations.

                                                                                      2. 1

                                                                                        Yes, this is also a good page about the issue, and related issues:

                                                                                        I disagree with all items marked as “#WRONG” on this site. Clean globbing is too powerful and beautiful to complexify with these mundane problems. Filenames are variable names, not data, and you get to choose them. What is actually wrong is the existence of files with crazy names. This should be solved at the filesystem level by disallowing e.g. the space character in a filename (no need to bother the user; it could be translated simply into a non-breaking space character, and nobody would notice except shell scripts).
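                                                                                        For what it’s worth, the breakage at stake is easy to reproduce; the glob itself is safe, and it’s the unquoted expansion that splits on the space (throwaway filename for illustration):

                                                                                        ```shell
                                                                                        cd "$(mktemp -d)"
                                                                                        touch 'two words.txt'

                                                                                        # Unquoted command substitution word-splits the name
                                                                                        # into "two" and "words.txt":
                                                                                        for f in $(ls); do echo "saw: $f"; done

                                                                                        # A plain glob hands the filename over intact:
                                                                                        for f in *; do echo "saw: $f"; done    # saw: two words.txt
                                                                                        ```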

                                                                                        1. 3

                                                                                          Filenames are variable names, not data, and you get to chose them

                                                                                          The problem is you don’t always; if I write a script and you run it, then the author of the script didn’t choose the filenames.

                                                                                          What is actually wrong is the existence of files with crazy names. This should be solved at the filesystem level

                                                                                          That is more or less the same conclusion the author makes:

                                                                                        2. 1

                                                                                          a better idea than “use -- everywhere”

                                                                                          An environment variable indicating the provenance of each argument, and sending patches to getopt, argp, coreutils, bash, and your favorite unix programs.

                                                                                        1. 2

                                                                                          In the sad reality outside of OSS projects, no one has time for that (or at least thinks so), but more importantly it’s just not taken seriously or even read, as people mostly just skim the headers of git log.

                                                                                          I have tried to show or apply many “good practices” and other nice patterns in terms of source control (and code management in general). It always fails; the number of “black matter developers” (look up the “the world runs on Java 8” article) is way larger than people can imagine. And I’m not okay with that; even giving talks and workshops does not help at all - devs just want to click OK and get on with things

                                                                                          1. 6

                                                                                            In the sad reality outside the OSS projects, no one has a time for that (or at least thinks so), but more importantly it’s just not being taken seriously or even read, as people mostly just skim over the headers of git log.

                                                                                            I can’t count the number of times I’ve seen some seemingly buggy code at work that I didn’t understand at all. I’m so happy when a quick “git blame” and a look at the commit message tell the story of why the code is the way it is. At the very least the commit message should include a ticket number, but ideally it explains why the change was made. This context is often exactly what you need to fix the code to do what was intended. Without such context, I’m sure I would mess things up (or undo some attempted half-fix from a colleague, re-introducing the original bug).

                                                                                            For this reason (and often times, I’m the guy who has to dig back and then finds his own commit and thinks “oh yeah, that’s what was going on”) I try to write good commit messages. It’s the neighbourly thing to do for your fellow committers (and yourself), and has little to do with OSS versus proprietary code. If you spend two or three hours fixing a bug or implementing a new feature, I think it’s warranted to spend 5 to 10 minutes crafting a good commit message. And even if it only took you 5 minutes, if you have even the slightest hunch that someone not as deep into the code base as you would need more time to understand, just make the effort.

                                                                                            Even if 90% of the commits don’t get read like a novel, that 10% of commits surrounding problematic or tricky code are worth it.

                                                                                            1. 3

                                                                                              The mentioned article that uses the term “black matter developers” is

                                                                                              1. 1

                                                                                                The term is “dark matter”, a reference to the concept from physics, not “black matter”, even though black is the darkest possible colour & I can see how this could be confusing to somebody for whom English is not their first language.

                                                                                              1. 4

                                                                                                Interesting, but if that is the actual lock (from their twitter video) then it seems like it would be trivial to cut with bolt cutters.

                                                                                                1. 3

                                                                                                  Sure, but that isn’t quite as elegant is it? :)

                                                                                                  1. 4

                                                                                                    Nope, but definitely quicker, cheaper, and simpler :D

                                                                                                    I really enjoyed your write-up of this, but I also (strangely) enjoy finding that Occam’s razor is still relevant sometimes.


                                                                                                  2. 3

                                                                                                    I’m reminded of the fingerprint lock that is “invincible to the people who do not have a screwdriver”.

                                                                                                    1. 2

                                                                                                      That one is also distinguished by the way it broadcasts its key to everything nearby and that they made the location of every lock publicly accessible online

                                                                                                  1. 4

                                                                                                    Oh boy. All those $variable $names. PHP is so weird if you’re not used to it anymore!

                                                                                                    And yeah, what @bram said. Things like addresses and names should ideally not be touched. The only thing you can maybe get away with is upcasing a last name which I’ve seen some people do, but apart from that, better don’t touch it.

                                                                                                    1. 4

                                                                                                      “Last name” doesn’t always make sense

                                                                                                      1. 3

                                                                                                        The only thing you can maybe get away with is upcasing a last name which I’ve seen some people do

                                                                                                        Consider the family name “van de Velde”. You really can’t get away with anything because it’s just not how things work outside the English-speaking bubble.

                                                                                                      1. 3

                                                                                                        The title should really say that unprivileged systemd users can execute systemctl commands. Not all Linux users are affected.

                                                                                                        1. 3

                                                                                                          The problem is located in polkit, and that’s where the fix is. You don’t need systemd to be vulnerable.

                                                                                                          1. 3

                                                                                                            Ok, then it should say “PolicyKit has a bug handling UID > INT_MAX” and be done with it. (Regardless of where the bug is, unprivileged users can’t execute arbitrary systemctl commands if systemctl isn’t installed. It’s part of Systemd). And: I don’t think PolicyKit is Linux-only, strictly speaking, though hopefully none of the other OSes use it by default.

                                                                                                            Point was: the bug is not a Linux bug and the title is misleading. I run Linux with neither Systemd nor PolicyKit and I’m not affected.

                                                                                                            1. 2

                                                                                                              The headline would be of far less use to some of us if it just talked about PolicyKit. I have no idea what PolicyKit is, so mentioning systemctl tells me this news is something to look at. I’m making an assumption that knowing systemctl but not PolicyKit is common; I’m confident it is among my co-workers.

                                                                                                              1. 4

                                                                                                                You could say “bug in PolicyKit allows running arbitrary systemctl commands”, it would be just as brief, just as informative, and would actually be accurate.

                                                                                                        1. 7

                                                                                                          The image of a squirrel scraped off the pavement is pretty gruesome, but for an alternate look at our fuzzy nut-stashing friends, there are many recipes from Hank Shaw and others.

                                                                                                          1. 2

                                                                                                            A cool thing about this article is that I feel like I understand why they’re moving away from Drupal after reading it, which is really unusual for a headline starting with “why”.

                                                                                                            There are 50% fewer modules actively maintained for Drupal 8 than 7.

                                                                                                            That’s not necessarily a bad thing. One area there are fewer modules is around Drupal Commerce and that’s by design. When re-writing it for Drupal 8 it was made to support the use cases of the most common Commerce modules from Drupal 7, cutting down on the number of extras you need to install.

                                                                                                            it now has an even steeper developer learning curve.

                                                                                                            I’m not sure that I buy this claim. Anything I’ve wanted to do in Drupal 8 has been a lot easier to discover than equivalents in 7. There’s also a lot less to learn to accomplish certain tasks. I generally find that if I use a plug-in to do something I can just worry about my one thing but in Drupal 7 I’d be tripping over issues brought up by every other module that implements the same hook until I somehow hold an understanding of every piece of the system in my head.

                                                                                                            Drupal is increasingly moving to an enterprise space, making it a more questionable value proposition for not-for-profit organisations.

                                                                                                            I definitely agree with this point. I’m not sure that a Drupal site is more expensive now, but there do seem to be more use cases that require writing code rather than just installing a module. The trade-off is that you were installing a module that doesn’t quite do what you want, maybe affects things you didn’t want to change, probably has XSS vulnerabilities (hopefully mitigated by requiring admin privileges), or ruins cacheability, or worse. Someone hoping to build a site by just installing the modules they want as a mostly “non-technical” user, or having a barely knowledgeable “site builder” manage it, is probably less happy with Drupal nowadays. But as someone happy to write code to get what I want, I look at sites stitched together out of low-quality modules as a house of cards in a breeze.

                                                                                                            1. 4

                                                                                                              It gives maximum control to the developer. The developer is responsible for the code they produce.

                                                                                                              In an article that seems to be contrasting static and dynamic typing, this strikes me as wrong. The alternatives (to duck typing) don’t lack control.
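                                                                                                              A minimal Python sketch of the point: a statically checked alternative to duck typing (here, a structural `Protocol`) accepts the same objects, so the developer gives up nothing; the contract just becomes explicit and verifiable before the code runs. The names are illustrative.

```python
from typing import Protocol

# Duck typing: any object with a .quack() works, but the
# assumption is implicit and only checked at runtime.
def describe(animal):
    return animal.quack()

# The static alternative: the same flexibility, but the contract
# is explicit and a type checker can verify callers ahead of time.
class Quacker(Protocol):
    def quack(self) -> str: ...

def describe_checked(animal: Quacker) -> str:
    return animal.quack()

class Duck:
    def quack(self) -> str:
        return "quack"

# Both accept the same Duck; neither caller has less control.
print(describe(Duck()))
print(describe_checked(Duck()))
```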

                                                                                                              1. 3

                                                                                                                Speaking in vague enough terms allows one to seldom be wrong.

                                                                                                                That’s a reason why you should always be wary of an “engineer” describing a technology as “expressive”, or “it gives me freedom/control”.