1. 29

When Azer unpublished all of their npm packages, I found this comment on the original issue thread. It reminded me of the rimrafall npm package, which prompted me to write this post.

IMO, for a simpler way to deal with this, there shouldn’t even be an “unpublish” option on npm in the first place.

  2. 11

    IMO, for a simpler way to deal with this, there shouldn’t even be an “unpublish” option on npm in the first place.

    There has to be one, e.g. for DMCA takedowns and accidental breaches.

    What definitely shouldn’t be the case is that you have to trust the maintainers of so many small packages when installing a larger npm package. This is an argument for rich stdlibs: you don’t have to review every trivial piece of code you depend on, or trust so many parties.

    And yet the latest trend seems to be to migrate away from richer stdlibs. Python’s stdlib is enormous, and for a long time it was good enough that the packaging story didn’t have to be that good. There are still a few Python projects out there that try to stay dependency-free because they still don’t trust Python packaging (requests, Django). Instead they manually vendor every piece of software they depend on. This obviously complicates software maintenance. The JavaScript and Rust ecosystems don’t have this problem because their stdlibs are small and their package managers were excellent from the start.

    I think the best of both worlds would be to “put the stdlib into the package index”, in the form of a collection of “officially sanctioned” packages maintained by a single party. That would probably avoid both the dependency problem Python faces (although it’s definitely getting better), and also the trust problem the JS ecosystem currently has.

    1. 10

      I think the best of both worlds would be to “put the stdlib into the package index”, in the form of a collection of “officially sanctioned” packages maintained by a single party.

      We are on track to do this with Rust. Our std is indeed small (relatively), but we’re also looking to maintain a set of “blessed” crates that are widely used. Right now, our process is for these crates to start out in the rust-lang-nursery, and then move to rust-lang after an RFC has been written for them (and moved to 1.0).

      1. 1

        Thank you for clarifying that – I’ve seen rust-lang-nursery but wasn’t sure how “official” those packages were.

      2. 5

        I think the best of both worlds would be to “put the stdlib into the package index”, in the form of a collection of “officially sanctioned” packages maintained by a single party. That would probably avoid both the dependency problem Python faces (although it’s definitely getting better), and also the trust problem the JS ecosystem currently has.

        Does JS actually have a trust problem, or is this just a theoretical concern at present? (Not saying we shouldn’t address it, but it’s a matter of priorities).

        I think you have little need for a stdlib per se - this is a debate that’s actively going on in Scala, where pieces of the standard library are being deliberately carved out and separated into standalone things.

        What you might want is good curation, maybe even verified through code signing - but I’m not really that sure what problem that would solve? In the Maven world it looks like it’s possible to require all your dependencies to be signed by a whitelist of GPG keys. I’ve never seen anyone bother doing this in practice though.

        1. 5

          On the topic of stdlib vs not, I think it’s important for there to be something called the stdlib to have important things in it. I don’t know how big it should be, but in OCaml-land we currently have the stdlib, which is near useless, and 2 competing stdlib replacements. We also have 2 cooperative threading frameworks which are incompatible, and neither of which is really a good foundational library. I’d like to have something in the stdlib, at the very least an interface. Right now anyone can make their own threading library and convince themselves that it is a good idea.

          1. 2

            I think even interfaces need to evolve, and forcing a bunch of unrelated interfaces to share a common release lifecycle is a bad idea. Compare Python where a lot of code is written using poor (or at least, less good than what’s available under modern practice) interfaces for http/subprocesses/etc. because the poor interface was what was standardised for the stdlib a long time ago and can only be changed occasionally.

            IMO: We do need a good way to encourage multiple implementations of the same idea to share interfaces, and to either discourage proliferation of multiple libraries for the same problem or have good enough curation that this isn’t a problem when selecting libraries in practice. But at the same time we definitely need the path to building a replacement for libwhatever with a better interface to remain open, even when whatever is fairly basic/foundational functionality.

          2. 2

            Does JS actually have a trust problem, or is this just a theoretical concern at present? (Not saying we shouldn’t address it, but it’s a matter of priorities).

            I don’t know how to answer this question, because I don’t have an overview of npm’s issue tracker/kanban/whatever :)

            What you might want is good curation, maybe even verified through code signing - but I’m not really that sure what problem that would solve? In the Maven world it looks like it’s possible to require all your dependencies to be signed by a whitelist of GPG keys. I’ve never seen anyone bother doing this in practice though.

            The problem is trusting the author of the package, which is a problem no code signing solution I know of is designed to solve. The simplest remedy I know is reducing the number of parties behind the third-party code you use, and the example that comes to my mind is a stdlib.

            1. 3

              The problem is trusting the author of the package, which is a problem no code signing solution I know of is designed to solve. The simplest remedy I know is reducing the number of parties behind the third-party code you use, and the example that comes to my mind is a stdlib.

              Tossing a bunch of one-liner packages together into a stdlib doesn’t mean those one-liners were written by fewer people. What are the guarantees you actually get about a line of code from the fact that it’s in the stdlib? They’re something like “was reviewed by at least two people from this list of stdlib maintainers”, right? Which you could equally well represent with code signing in a package management system.

              1. 2

                What are the guarantees you actually get about a line of code from the fact that it’s in the stdlib?

                I get to trust one organization, instead of a bunch of independent contributors. I get synchronized release cycles and features being rolled out across the whole stdlib at once. I get an implicit promise that parts of the lib will not disappear or change API without warning, thus breaking other parts of the lib. I hopefully get a system evolving in a clear direction. I get consistency.

                1. 1

                  Now I understand what you mean. Yes, that’s also a possibility.

            2. 1

              This obviously complicates software maintenance

              I’m not sure it does? Every proprietary piece of software I’ve worked on takes the “copy into a vendor directory” approach, and it’s ended up being incredibly simplifying for most developers.

              1. 2

                Simplifying what?

              2. 1

                Unpublishing a package is the nuclear option. IMO it should require a manual admin action to do so. This is how it works for Clojars which I help administer.

              3. 12

                It’s 2016 and Node’s developers haven’t figured out yet that global names are never a good idea. Using UUIDs or, as the article suggests, namespaces would be a straightforward solution, but it’s amazing that they didn’t do that from the start.

                1. 4

                  What exactly would namespacing have solved in this instance?

                  Let’s review what happened (within the package repo):

                  • current version of left-pad is unpublished
                  • npm decides to break their own rules to avoid breakage: re-publishes old version (normally impossible) under new author

                  If this scenario happened with a namespaced package, what would stop npm from breaking their own rules again and transferring ownership of the (now namespaced) package?

                  1. 6

                    Namespacing would prevent someone from republishing a malicious new version to anyone who uses caret dependencies.

                    Hell, namespacing could have prevented this entire fiasco since kik would really only need ownership over the “kik” namespace, not every package named kik.

                    1. 6

                      Namespacing would prevent someone from republishing a malicious new version to anyone who uses caret dependencies.

                      Again, if I unpublish a package from my namespace, and npm decides to break the rules to avoid massive breakage – what has changed?

                      Also keep in mind that I’m not arguing against namespacing in general, but I don’t think it would’ve helped with anything in this case.

                      Hell, namespacing could have prevented this entire fiasco since kik would really only need ownership over the “kik” namespace, not every package named kik.

                      IANAL but that seems speculative to me.

                      EDIT: See this DMCA for GitHub, which does have namespaces: https://github.com/github/dmca/blob/master/2014-02-12-WhatsApp.md

                      1. 4

                        No one can stop npm from doing whatever, but namespacing done by npm username would prevent a malicious third-party from uploading a new version of the package. I’m saying that is an improvement, not that it would have changed anything.

                        And yes, kik could still send a DMCA, but they would have much less reason for doing so. In this case, they probably wanted to publish a module for accessing their API and found that the logical name was already in use.
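To make the "namespacing done by npm username" point concrete, here is a toy Python sketch of a registry where every package lives under its owner's name. It is not npm's API: the `ScopedRegistry` class, the `owner/name` format, and the user names are all invented for illustration.

```python
class ScopedRegistry:
    """Toy registry where every package name has the form 'owner/name'."""

    def __init__(self):
        # Maps "owner/name" to the list of versions published so far.
        self._versions = {}

    def publish(self, user, qualified_name, version):
        owner, separator, name = qualified_name.partition("/")
        if not separator or not owner or not name:
            raise ValueError("expected a name of the form 'owner/name'")
        # Only the namespace owner may publish under that namespace.
        if user != owner:
            raise PermissionError(f"{user} cannot publish into {owner}'s namespace")
        self._versions.setdefault(qualified_name, []).append(version)

    def versions(self, qualified_name):
        return list(self._versions.get(qualified_name, []))


registry = ScopedRegistry()
registry.publish("azer", "azer/kik", "1.0.0")  # the original author's package
registry.publish("kik", "kik/kik", "1.0.0")    # the company's own package coexists
```

In this model a third party cannot push a new version of `azer/kik`, and both `kik` packages can exist side by side. As noted above, nothing here stops the registry operator itself from overriding the rules.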

                        1. 1

                          namespacing done by npm username would prevent a malicious third-party from uploading a new version of the package

                          Something that hasn’t happened, and as far as I can see doesn’t happen. ~/^ dependencies are always a tradeoff and it’s not clear to me that upgrading to newer versions after a package was handed over to a third party isn’t the intended behaviour in that case.
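For readers unfamiliar with the `^` and `~` ranges discussed in this exchange, here is a rough Python sketch of how they resolve for plain `major.minor.patch` versions. It is a simplification of npm's range rules (no pre-release tags, no partial versions such as `^1.2`), and the function names are made up for illustration.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a plain 'major.minor.patch' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_upper_bound(base: Version) -> Version:
    """Caret: allow changes that keep the left-most non-zero component."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def tilde_upper_bound(base: Version) -> Version:
    """Tilde: allow patch-level changes only."""
    major, minor, _ = base
    return (major, minor + 1, 0)


def satisfies(version: str, range_spec: str) -> bool:
    """Check a version against a range such as '^1.2.3' or '~1.2.3'."""
    operator, base = range_spec[0], parse(range_spec[1:])
    if operator == "^":
        upper = caret_upper_bound(base)
    elif operator == "~":
        upper = tilde_upper_bound(base)
    else:
        raise ValueError("only ^ and ~ ranges are handled in this sketch")
    return base <= parse(version) < upper
```

Under these rules a dependency declared as `^1.2.3` would pick up a newly published `1.9.0` on the next fresh install, whoever published it. That is the tradeoff being argued about here.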

                  2. 2

                    It’s 2016, and people who are still using NPM as a build tool, even after being explicitly told not to, deserve to have their builds break because of political shit like this.

                  3. 4

                    The manifest listing dependencies should at least contain the hash of each dependency (SHA256 for example) and check it.
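A minimal Python sketch of that idea, assuming the manifest is a plain mapping from a dependency identifier to its expected SHA-256 hex digest. The manifest shape, the function names, and the package bytes below are placeholders for illustration, not any real package manager's format.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_dependencies(manifest: dict, downloaded: dict) -> list:
    """Return the dependencies that are missing or whose hash does not match.

    manifest:   dependency identifier -> pinned SHA-256 hex digest
    downloaded: dependency identifier -> bytes actually fetched
    """
    mismatched = []
    for name, expected_digest in manifest.items():
        content = downloaded.get(name)
        if content is None or sha256_hex(content) != expected_digest:
            mismatched.append(name)
    return mismatched


# Placeholder bytes standing in for a package tarball.
original = b"module.exports = leftpad;"
manifest = {"left-pad@0.0.3": sha256_hex(original)}

# A republished artifact with different contents fails the check,
# even though its name and version are unchanged.
tampered = b"module.exports = somethingElse;"
problems = verify_dependencies(manifest, {"left-pad@0.0.3": tampered})
```

With a check like this, a package republished under the same name and version by a different author would only install cleanly if its bytes were identical to what was pinned.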

                    1. 1

                      My sense is there shouldn’t be a unilateral “unpublish” button if your package is named as a dependency by other packages. If it is named as a dep, you as the developer and package owner should be able to serve a deprecation notice that lasts some quantum of time on the order of weeks so that maintainers of dependent packages can replace the code they depend on. For something like “padLeft” it’s trivial to rewrite, but for a package that’s a bit more complicated it could be much worse.
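The rule described above could be sketched roughly like this in Python. The four-week window is an assumption (the comment only says "on the order of weeks"), and `may_unpublish` and its parameters are hypothetical names, not part of any registry.

```python
from datetime import date, timedelta
from typing import Optional

# Assumed notice period; the proposal only says "on the order of weeks".
DEPRECATION_WINDOW = timedelta(weeks=4)


def may_unpublish(dependent_count: int,
                  deprecated_on: Optional[date],
                  today: date) -> bool:
    """Decide whether a package may be unpublished right now.

    dependent_count: number of other packages naming this one as a dependency
    deprecated_on:   date the deprecation notice was posted, or None
    today:           date of the unpublish request
    """
    if dependent_count == 0:
        # Nobody depends on it, so unpublishing breaks nothing.
        return True
    if deprecated_on is None:
        # Dependents exist and were never warned.
        return False
    # Dependents exist: require the full notice period to have elapsed.
    return today - deprecated_on >= DEPRECATION_WINDOW
```

For example, a package with five dependents that posted its notice on 2016-03-01 could not be removed on 2016-03-24, because only 23 days of the assumed 28-day window would have passed.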

                      I still enjoy the democratic nature of NPM and that there is tons of flexibility in which libraries you choose, all the way down to what string manipulation libraries you go with. This single point of failure issue is very bad, however, and at least some thought should have been put into what happens if a developer yanks a low-level, heavily depended-on package.