1. 38

  2. 16

    Even though I love Rust, I am terrified every time I look at the dependency graph of a typical Rust library: I usually see dozens of transitive dependencies written by Internet randos whom I have zero reason to trust. Vetting all those dependencies takes far too much time, which is why I’m much less productive in Rust than Go.

    I try to also use the same level of scrutiny when bringing in dependencies in Rust. It can be a challenge and definitely uses up time. This is why the crev project exists, so that the effort can be distributed through a web of trust. I don’t think it has picked up a critical mass yet, but I’m hopeful.

    Some projects (including my own) have also been taking dependencies more seriously, in particular by providing feature flags that switch off optional functionality and so reduce the dependency count.

    1. 9

      There are also more direct tools like lichking, which can help you search for deps with licenses you don’t like.

      1. 2

        Indeed. I regularly use that on my projects with more than a handful of dependencies as a sanity check that there is zero copyleft in my tree.

      2. 2

        Some projects (including my own) have also been taking dependencies more seriously

        One of my biggest pet peeves in Rust is duplicate dependencies. Docs.rs is the worst offender I build regularly (we currently compile 4 different versions of syn!) but it’s a problem throughout the ecosystem. I’ve opened a few different bugs but it usually gets marked as low priority or ‘nice to have’.

        Part of the problem is that so few crates are 1.0 (looking at you, rand), but another part is that IMO people aren’t very aware of their dependency tree. I regularly see crates with 150+ dependencies and it boggles my mind.

        Hopefully tools like cargo tree, cargo audit, and cargo outdated will help but there still has to be some effort from the maintainers.
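        As a rough illustration of the duplicate-version check described above, here is a minimal Python sketch. It assumes the lock file has already been reduced to (name, version) pairs; the `locked` list and the `find_duplicate_versions` helper are hypothetical, not part of any cargo tool.

```python
from collections import defaultdict


def find_duplicate_versions(packages):
    """Return {name: sorted versions} for every crate locked at more than one version."""
    versions = defaultdict(set)
    for name, version in packages:
        versions[name].add(version)
    # Sorting is plain string order, which is enough for display but is not semver-aware.
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


# Hypothetical (name, version) pairs, as they might be read from a lock file.
locked = [
    ("syn", "0.15.44"),
    ("syn", "1.0.109"),
    ("rand", "0.7.3"),
    ("rand", "0.8.5"),
    ("serde", "1.0.190"),
]

duplicates = find_duplicate_versions(locked)
for name, found in duplicates.items():
    print(f"{name}: {', '.join(found)}")
```

        A report like this only shows where the duplication is; collapsing it still needs the maintainers of the intermediate crates to update their version requirements.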

      3. 10

        On the rust side I use cargo-deny to block unwanted licenses. It’s great.

        1. 2

          This is great and should exist for every tool. Next, generalize it and allow random code quality and open/closed PR metrics and anything else you can think of…

        2. 3

          Meh - this should be automated. A license change is a breaking semantic change in a library. There are finite licenses, and known incompatibilities like this can be modeled.

          1. 4

            There are finite licenses

            Not really. Some people write their own n-clause BSD license or so. It’s often a matter of changing a few words.

            1. 11

              An automated tool should flag this for manual review and probably removal, because making your project depend on some jackass’s untested, non-peer-reviewed NIH legal code is usually a terrible idea.

              1. 3

                I think without intending to, you just proved my point. For a project that writes their own n-clause BSD license, these will inherently be dissimilar enough when looked at through automated tooling to trigger a build / package fail. This is good.

                1. 2

                  I agree. I mean, I should actually do this. I don’t think I have ever seen this automated, and it is a shame. Licence changes in dependencies are probably rather uncommon, but the risk is high.
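                  One way this could be automated, sketched in Python: keep a snapshot of each dependency’s declared license and compare it against the next snapshot. The `license_changes` helper, the package names, and the two snapshot dicts are all made up for illustration; a real check would still need to extract the license metadata from somewhere.

```python
def license_changes(previous, current):
    """Return {dep: (old, new)} for deps whose declared license differs between snapshots."""
    return {
        dep: (previous[dep], current[dep])
        # Only deps present in both snapshots can have *changed*; new deps are a separate review.
        for dep in previous.keys() & current.keys()
        if previous[dep] != current[dep]
    }


# Hypothetical snapshots taken before and after a dependency update.
previous = {"foolib": "MIT", "barlib": "Apache-2.0"}
current = {"foolib": "MIT", "barlib": "GPL-3.0-only", "bazlib": "BSD-3-Clause"}

changed = license_changes(previous, current)
for dep, (old, new) in changed.items():
    print(f"{dep}: {old} -> {new}")
```

                  A non-empty result could fail the build, which treats a license change like any other breaking change.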

                2. 2

                  There are finite licenses that we should consider using. I’m not saying one shouldn’t use an n-clause BSD license, but that one should have a relatively small number of them, and they should be rejected by default and later accepted as case-by-case exceptions.
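                  The reject-by-default policy with case-by-case exceptions could look roughly like this in Python. The allowlist, the exception entry, and the package names are assumptions made up for the sketch, not a recommendation of specific licenses.

```python
# Licenses accepted without further review (illustrative choice only).
ALLOWED = {"MIT", "Apache-2.0", "BSD-3-Clause"}

# Exceptions are granted per (package, license) pair after manual review.
EXCEPTIONS = {("oddlib", "Custom-BSD-like")}


def review_licenses(deps):
    """Split dependency names into (accepted, rejected); unknown licenses are rejected."""
    accepted, rejected = [], []
    for name, license_id in sorted(deps.items()):
        if license_id in ALLOWED or (name, license_id) in EXCEPTIONS:
            accepted.append(name)
        else:
            rejected.append(name)
    return accepted, rejected


# Hypothetical dependency -> declared license mapping.
deps = {"foolib": "MIT", "oddlib": "Custom-BSD-like", "newlib": "Homemade-4-clause"}
accepted, rejected = review_licenses(deps)
print("accepted:", accepted)
print("rejected:", rejected)
```

                  Keying exceptions on the package as well as the license means a homemade license approved for one dependency does not silently pass for another.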

                  1. 1

                    they can be grouped by effect

                  2. 4

                    As he mentions, license compatibility is part of what you look for, but definitely not all. Can’t automate that. And, as the sibling mentions, licensing terms are absolutely not finite.

                    1. 2

                      There are automated tools to aid license review when doing packaging for Debian but they don’t remove the need for a thorough visual inspection.

                      The licenses are then summarized in a standard format in the debian/copyright file.

                      Unfortunately upstream developers and other distributions often don’t care about providing such clear licensing information.

                    2. 2

                      Something about this article itched at me, and on the orange site, user danShumway said this, which hit the nail on the head for me:

                      If you’re looking at [another] ecosystem and saying, “the number of dependencies is problematic because it takes a long time to review them”, I agree with you. If you’re looking at the Go ecosystem and saying, “there are fewer dependencies, so I don’t need to review them”, then that’s a security antipattern.

                      For example, the Rust standard library was kept small by design because they acknowledge that something shouldn’t be “trusted” simply for being part of std.

                      1. 1

                        I think it depends what sort of security you’re hoping the standard library gives you.

                        A standard library might make poor crypto choices, do funny things with deserialization, or contain any amount of other security-sensitive code that can be a risk, so presence in the standard library isn’t anything like a full seal of approval.

                        I still think that presence in most languages’ standard library gives you some assurance against the kinds of “supply chain” attacks we’ve recently seen in NPM and PyPI. For many libraries, those supply chain attacks are the primary security issue the library raises.

                        1. 2

                          I still think that presence in most languages’ standard library gives you some assurance against the kinds of “supply chain” attacks we’ve recently seen in NPM and PyPI.

                          I don’t think you should be conflating those last two.

                          The thing people seem to worry about in a “supply chain” attack is that they’re depending on a particular package – let’s say foolib – and one day an evil person compromises the package-registry account of foolib’s maintainer, and uploads new packages containing malicious code, which are then pulled automatically by the build processes of people depending on foolib. I believe that has happened a few times to packages on npm.

                          But as far as I’m aware, that’s not a thing that has happened to PyPI. All the alleged “supply chain attack” stories I’ve seen about PyPI involved typosquatters who’d register a similarly-named package and hope to trick people into installing it instead of the real thing. So, say, someone registering foo-lib or foo-library and hoping you’d not look too closely and conclude their package was what you wanted. While that’s a thing that definitely needs to be policed by the package registry, anyone with foolib in their dependency list is never at risk of receiving a malicious package in that case. Only someone who adds the malicious typosquat as a dependency is in trouble.

                          (it’s also something difficult to police in an automated way, because it’s somewhat common for package registries to end up with multiple similarly-named but legitimate packages)
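                          To make the difficulty concrete, here is a small Python sketch of a naive name-similarity check using the standard library’s `difflib`. The `lookalikes` helper, the 0.8 threshold, and the package names other than foolib are assumptions chosen for illustration; this is not how any registry is known to work.

```python
from difflib import SequenceMatcher


def _normalize(name):
    # Treat "foo-lib", "foo_lib" and "FooLib" as the same spelling.
    return name.lower().replace("-", "").replace("_", "")


def lookalikes(candidate, known, threshold=0.8):
    """Return existing package names whose normalized spelling is close to the candidate's."""
    hits = []
    for name in known:
        if name == candidate:
            continue
        ratio = SequenceMatcher(None, _normalize(candidate), _normalize(name)).ratio()
        if ratio >= threshold:
            hits.append(name)
    return hits


known = ["foolib", "requests"]
print(lookalikes("foo-lib", known))   # a likely typosquat of foolib is flagged
print(lookalikes("foolib2", known))   # but so is a plausible legitimate successor
```

                          Both names are flagged as resembling foolib, and nothing in the spelling tells the squatter apart from a legitimate follow-up package, so a human still has to decide.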

                          1. 1

                            Thanks, I thought PyPI had both types of attacks, but it appears it’s only been typosquatting.