Never again will we delete a module. Until next time.
npm: the Ethereum DAO of package managers
Normalization of deviance is a thing. I can absolutely sympathise: you see a completely non-functional package and see no need to follow the full clunky process this time.
Wasn’t the process there precisely to prevent situations like this? That’s why we have processes: people making ad hoc decisions make mistakes.
Sure. Problem -> impose process to fix problem -> problem caused by working around the process -> realise the process really is important -> problem is now actually resolved is a sequence that happens fairly often though. “Never again until next time” is excessively pessimistic.
I think I’d be more optimistic if it’d been at least six months since the last incident. (But we did make it five whole months. So close!) Or if their track record was better than a seeming 0/1. Or if this had been some truly bizarre edge case. Or if the rationale for not following process had been somewhat meatier than “I didn’t think”.
I use a very naive predictor: whatever happened last time will happen next time. It’s actually pretty accurate.
The great thing about Unangst Learning is that you don’t need a fleet of GPUs or Hadoop. You just need a stack of size 1. I predict a fleet of new startups in the UL sector shortly.
That’s a rephrasing of a popular psychology maxim: The best predictor of future behaviour is past behaviour.
…the “fs” package is a non-functional package. It simply logs the word “I am fs” and exits. There is no reason it should be included in any modules. However, something like 1000 packages do mistakenly depend on “fs”, probably because they were trying to use a built-in node module called “fs”.
How does this even happen - it seems like a MITM attack waiting to happen… Not to mention the rather scary thought that over 1,000 packages have been published with a useless dependency and the authors don’t even realise it. I don’t know what that says about the quality of a lot of npm modules…
A big reason this happens is that you have a lot of micro-dependencies, so you have tools that will manage your dependency file “automatically”. So if one package ends up making the mistake, it spreads virally.
I would love there to be a bot just auditing every package.json on github and sending PRs to remove useless packages from their dependencies.
I really hate getting PR spam from bots like that.
Interesting, do you get a lot of them? Do you dislike them because they are wrong?
That would be easy if there was a list of useless dependencies. Does that exist in any form?
Not quite what was asked for, but there is this tool which checks for unused dependencies. https://github.com/depcheck/depcheck
Besides my subjective opinions, I think it’s a big enough project to earn its own tag.
Propose that in a thread tagged meta, so the community can decide. It will help to explain why, and to provide links to existing posts that would be tagged with it.
Thanks, done! https://lobste.rs/s/j2bxnz/meta_nodejs_tag
This is one of the reasons why I love having a minimal amount of external dependencies for stuff, so I can read through the code of what I import and understand what it does. Not easily doable with nodejs.
I don’t think that’s true of node. You can still write code with minimal external dependencies; there just happen to be a fairly large number of people who don’t.
I would argue that this removal (along with a bunch of stuff that happened recently) is indicative of a greater node community push to stop relying on every dependency under the sun to function.
I agree, I just happen to only have seen (and have to take care of) projects with a gazillion dependencies. Ah, the things you inherit and have no time to clean up.
[Comment removed by author]
This seems incredibly hard to maintain and would not have solved any of the problems in this scenario.
number 1 problem that happened: users tried to get ‘fs’ from npm instead of relying on the local built-in version.
In order to find the SHA256 of ‘fs’ in order to list it as a dependency, they would need a way to look up the association between ‘fs’-the-package and its correct SHA256 hash. This registry of package-to-SHA would be no less vulnerable to the namespace poisoning attack that happened.
number 2 problem that happened: the npm maintainer panicked when a package was reported as spam and deleted it from the archive without following procedures. This is a social issue that has nothing to do with the SHA256.
number 3 problem that happened: the npm maintainer saw that the package was completely pointless, added no value, and was easily confused with a critically important system package. He saw further that a namespace poisoning attack was not only possible, but ongoing. He elected to restore the namespace poisoning attack in order to comply with some policy that is clearly dangerously bureaucratic and incoherent, when in fact he should have deleted the package and started taking steps to eliminate namespace poisoning attacks.
on 2 and 3: there has to be some association with an item in the CAS and its proper name. For example, if I want to use the latest ‘crypto’ library, what is the hash I should embed? That association has to be made somehow, and it has to be curatable in case of error or problem. This naming service/directory/whatever is the attackable item, much as DNS and torrent directories are attackable today. Consider the case that A and B both want to create updates to the popular ‘crypto’ package. One of them is a malicious entity, and the other one is a reputable author. Which one should the algorithm trust, A or B? You can’t write code to figure that out.
On 1: I agree.
The great thing that a CAS would provide would be durable package immutability, so that you couldn’t hack an app by changing its underlying dependencies. So people should do that! But it doesn’t solve the problems here.
Consider the case that A and B both want to create updates to the popular ‘crypto’ package. One of them is a malicious entity, and the other one is a reputable author. Which one should the algorithm trust, A or B? You can’t write code to figure that out.
You can’t, but you can avoid updating at all without developer intervention, and thereby not break things that are currently working.
You have to explain why when you write a short comment like this. Merely reasserting yourself won’t help anyone to understand your point of view.
You could rebut /u/markerz’s points and wait for a reply. For example:
Hashing libraries would not be difficult to maintain. Git stores a hash of the project state every time you commit! With enough bits, collisions are incredibly rare (except in the case of a malicious actor).
You would search for packages by name, but depend on them via cryptographic hash ID. “Unpublishing” a package would remove its name from search, but continue to host the package for download by hash ID.