1. 75

  2. 24

    Since it’s a Medium post with a clickbait title, here’s a TL;DR:

    While attempting to hack PayPal with me during the summer of 2020, Justin Gardner (@Rhynorater) shared an interesting bit of Node.js source code found on GitHub.

    The code was meant for internal PayPal use, and, in its package.json file, appeared to contain a mix of public and private dependencies — public packages from npm, as well as non-public package names, most likely hosted internally by PayPal. These names did not exist on the public npm registry at the time.

    The idea was to upload my own “malicious” Node packages to the npm registry under all the unclaimed names, which would “phone home” from each computer they were installed on.

    Apparently, it is quite common for internal package.json files, which contain the names of a JavaScript project’s dependencies, to become embedded in public script files during the build process, exposing internal package names. Similarly, leaked internal paths or require() calls within these files may also contain dependency names. Apple, Yelp, and Tesla are just a few examples of companies that had internal names exposed in this way.

    This type of vulnerability, which I have started calling dependency confusion, was detected inside more than 35 organizations to date, across all three tested programming languages.

    Feels weird and scary that this had always been possible! Another incident to add to the “package management is solved” meme. Great article.

    1. 10

      public packages from npm, as well as non-public package names, most likely hosted internally by PayPal.

      Even if you’re not using npm’s organization feature to host your modules, you probably want to use names scoped to an npm account or organization you control, so others can’t publish packages with matching names to the public registry.
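
      For illustration, a scoped name in package.json ties the package to an npm org you control (the @acme scope and package name here are placeholders):

      ```json
      {
        "name": "@acme/billing-utils",
        "version": "1.0.0",
        "publishConfig": { "access": "restricted" }
      }
      ```

      Even without a paid org, registering the scope means nobody else can publish @acme/* names to the public registry.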

      That said, dependency managers probably shouldn’t be running arbitrary code on users’ machines during installation, as is the case with the preinstall script used in this example. Unfortunately, this was reported back in 2016 (VU#319816) and nothing came of it.
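
      For context, a lifecycle script like preinstall runs automatically at install time, so a hypothetical malicious package (names here are made up, not from the article) needs little more than:

      ```json
      {
        "name": "acme-co-internal-package",
        "version": "9000.0.1",
        "scripts": {
          "preinstall": "node phone-home.js"
        }
      }
      ```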

      1. 8

        I don’t really know how anything about npm dependency fetching works, but shouldn’t the logic be, “Do we have an internal package called ‘foo’? If not, look for public packages called ‘foo’.”? Based on the article description it sounds like it must be doing, “Is there a public package called ‘foo’? If not, look for an internal one”. Is this really how it works?

        1. 7

          npm has a limited concept of different registries. It fetches all packages from the one set in the global configuration file, an environment variable, or a CLI flag. The exception is scoped modules (modules whose names look like @mycompany/foobar), where each scope (the @mycompany part) can be assigned its own registry.
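
          That per-scope assignment lives in .npmrc; a sketch using the @mycompany placeholder from above (the registry URL is made up):

          ```ini
          ; .npmrc — route only @mycompany/* packages through the internal registry
          @mycompany:registry=https://npm.internal.example.com/
          ```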

          If you pay npm, you can set scoped packages published on their registry to only be installable by users logged into your organization.

          Before scoped modules were added to npm, the best you could do was use unscoped package names that didn’t exist on the public registry, and point npm at a proxy that decided which backend to fetch a package from based on the requested name. A common implementation checked an internal registry first and, if the package didn’t exist there, fetched it from the public registry.
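
          That internal-first decision can be sketched in a few lines of JavaScript (the registry URLs and the lookup set are assumptions, not any particular proxy’s code):

          ```javascript
          // Hypothetical proxy routing: prefer the internal registry and fall
          // back to the public one only when the name is unknown internally.
          const INTERNAL_REGISTRY = "https://npm.internal.example.com";
          const PUBLIC_REGISTRY = "https://registry.npmjs.org";

          function pickRegistry(name, internalNames) {
            // internalNames: a Set of package names the internal registry serves
            return internalNames.has(name) ? INTERNAL_REGISTRY : PUBLIC_REGISTRY;
          }
          ```

          The ordering matters: checking the public registry first, or preferring whichever side has the higher version, is exactly what opens the door to dependency confusion.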

          The author of this post provides examples of internal modules being unscoped, so I assume these companies are relying on developers connecting to a proxy to fetch the correct dependencies. I could easily envision scenarios where new developers, CI systems, or IDEs are improperly configured and fetch those names from the public registry instead, hence this vulnerability.

          1. 3

            If the package exists on both [the internal and public], it defaults to installing from the source with the higher version number.

            The kicker there being that you can publish an arbitrarily high-versioned package, e.g. 9000.0.1, to force the public (malicious, in this context) dependency. The article also describes the same behavior in Artifactory, which is popular within companies for hosting various internal packages (including npm):

            Artifactory uses the exact same vulnerable algorithm described above to decide between serving an internal and an external package with the same name.
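
            A simplified sketch of why a 9000.0.1 upload wins under that rule (an illustrative numeric compare, not npm’s or Artifactory’s real semver logic):

            ```javascript
            // Illustrative version comparison over dot-separated numeric parts.
            // Real resolvers use full semver; this only shows the ordering.
            function newerVersion(a, b) {
              const pa = a.split(".").map(Number);
              const pb = b.split(".").map(Number);
              for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
                const da = pa[i] || 0;
                const db = pb[i] || 0;
                if (da !== db) return da > db ? a : b;
              }
              return a; // versions are equal
            }

            // An attacker's public 9000.0.1 outranks a plausible internal 1.4.2,
            // so a highest-version-wins resolver picks the public package.
            newerVersion("9000.0.1", "1.4.2"); // → "9000.0.1"
            ```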

            I think for npm, using the save-exact feature would be a fix—and imho a sane default—but I’m not 100% certain.
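
            For reference, that setting lives in .npmrc (it pins exact versions on save, though it doesn’t by itself control which registry a name resolves against):

            ```ini
            ; .npmrc — write exact versions instead of ^x.y.z ranges on save
            save-exact=true
            ```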

            1. 2

              I’m not sure this is accurate, or at least it wasn’t the implementation of any proxies I worked on or with back when I was still working on npm.

              npm would ask the proxy for information about a package name. All the proxies I used would query that metadata from the internal registry, and only if that returned nothing did they fetch information from the public registry.

              This implementation choice was made in the proxies to allow teams to hold back or override open source modules they used (especially useful with deeply nested dependencies before lockfiles) and to avoid situations where someone else claimed the same name to try to get you to fetch it instead (this being before scoped modules).

              I haven’t been in the Node.js community for about 4 years now, and have never had access to Artifactory, so I can’t confirm or deny what implementation they’re using now. It would be a shame if they forged ahead without considering the security concerns that open-source alternatives had long since addressed.

              1. 1

                I’ll be honest: I’m not sure of the technical differences between Artifactory and the proxies you worked with. When I’ve previously used Artifactory (as a humble user), it effectively worked as a pull-through cache of sorts: serve a package that exists internally, then fall back to the public registry if necessary. A recent example that comes to mind is Docker Hub’s change to rate-limit requests.

                Anyways, your reply made me think more specifically about the Node.js/npm vector from the article:

                Unless specified otherwise (via --registry or in a .npmrc), the default (public) registry is used. Given that, I think it’s not out of the question for an npm install acme-co-internal-package to be run blindly, which would hit the public (malicious) package if there’s no internal registry specified. Just my $0.02.
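
                A sketch of pinning that in a project-level .npmrc, so unscoped internal names can’t fall through to the public registry (the URL is a placeholder):

                ```ini
                ; .npmrc committed with the project, so every npm install uses it
                registry=https://npm.internal.example.com/
                ```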

                1. 2

                  Yeah, that’s the conclusion I wrote up thread.

                  I could easily envision scenarios where new developers, CI systems, or IDEs are improperly configured and fetch those names from the public registry instead, hence this vulnerability.

                  1. 1

                    D’oh, I missed that. Just like the pesky step in a project’s README that tells (hypothetical) you to set the internal registry. ;^]

                    I’m sure it’s a curious sight internally at npm to see all the 404ing requests for packages—many of which exist in an internal registry.

        2. 3

          The article is (intentionally, I believe) vague about it, but I’m curious how they came across all the dependency declaration files in the first place.

          common for internal package.json files, which contain the names of a javascript project’s dependencies, to become embedded into public script files

          I don’t quite follow. Anyone have insights on the semantics of “leak” in this context?

          1. 1

            I think they might be concatenated into the production minified JS file by a misconfigured build pipeline, but that’s just a guess.

        3. 5

          The pun is perfect, although the author doesn’t mention it; this is a confused deputy attack!

          1. 1

            Hm is it? Which process is the confused deputy? npm?

            I’m not sure capability-based security helps here. I think the fix is in npm’s resolution logic. And you could write the same bug in a capability-based system too.

            My impression is that the whole package ecosystem is “confused” with respect to security (which is why I almost never use any of this stuff; I try to download tarballs directly from PyPI :) ). I’m not sure it’s specifically a confused deputy.

          2. 1

            How long until vendoring of dependencies becomes the accepted norm? It solves, or greatly reduces, a significant number of issues.

            1. 3

              You need to keep updating the vendored dependencies (otherwise you’ll have known vulnerabilities baked into your codebase), but as soon as you start updating them, you’re back to square one with the confused fetching logic. To me it sounds like the same issue, but with extra steps.

              1. 5

                If your developers are blindly checking in changes, and then deploying those changes without review, then yes, it’s the same issue.

                Half the point of vendoring your dependencies is that hopefully the person who updates them, pays the slightest bit of attention to what they’re committing, and if they don’t, hopefully the person reviewing the change will be paying a little bit of attention.

                If neither of those things is happening, you have bigger problems than malicious dependencies, mate.

            2. 1

              This is not a new attack vector; it has been documented for several years. What’s interesting is the fact that these companies allow their devs to install anything via their package management tool.