1. 12

Today we’re publishing another Libraries.io open data release with over 311 million rows of metadata about open source projects and the network of dependency data that connects them all.


  2. 4

    This project, like others I’ve found, doesn’t seem to handle Python dependencies properly. For example, Flask 0.12 is listed as having no dependencies, but actually it depends on Werkzeug, Jinja2, click, and itsdangerous. [1]

    I think this is because these sites are grabbing package JSON data from PyPI, but many (most?) packages don’t declare their dependencies there. As far as I know, the only way to accurately resolve dependencies for Python packages is to:

    1. Grab the Wheel (if there is one) and inspect the metadata.json file, otherwise
    2. Download the source distribution and actually install the package. This is very often necessary, as many projects (or older releases) don’t have a Wheel.
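
    For the wheel case, step 1 might look roughly like the sketch below. It assumes the wheel carries a `*.dist-info/METADATA` file with `Requires-Dist` headers, which is the layout current wheels use as far as I know, rather than the metadata.json file mentioned above. `wheel_requirements` is a made-up helper name, not part of any library.

    ```python
    import io
    import zipfile
    from email.parser import Parser


    def wheel_requirements(wheel_file):
        """Return the Requires-Dist entries declared inside a wheel archive.

        `wheel_file` may be a path or a binary file-like object.
        """
        with zipfile.ZipFile(wheel_file) as wheel:
            # A wheel is a zip archive; look for <name>-<version>.dist-info/METADATA.
            metadata_name = next(
                name for name in wheel.namelist()
                if name.endswith(".dist-info/METADATA")
            )
            text = wheel.read(metadata_name).decode("utf-8")
        # METADATA uses RFC 822 style headers, so the stdlib email parser can read it.
        headers = Parser().parsestr(text)
        return headers.get_all("Requires-Dist") or []
    ```

    This only reads what the wheel declares, so nothing from the package is executed. It does not help with the sdist-only case in step 2.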

    I’ve been working on a project that actually has the correct dependency graph for Python libraries. I’ve had to follow the approach above. It’s not quite ready to show the world, but I’m hoping it’ll be interesting for people.

    [1] https://libraries.io/pypi/Flask/0.12

    1. 2

      Yeah, Python dependencies are not easily machine readable, and for the moment I’m trying to avoid executing setup.py files downloaded from the internet on the Libraries.io servers. Any help contributing better Python support would be great.

      1. 3

        I’m trying to avoid executing setup.py files downloaded from the internet on the Libraries.io servers

        That’s wise - I’ve seen all sorts of shenanigans in those files. Just importing some of them causes attempted sudo operations.

        1. 2

          I’m not particularly familiar with the Python world, but this sounds like the perfect use case for containers, no?

          Note I said containers not “docker”. I believe what you want is a quick “spin up $distro, install $package, analyse installed deps” flow, which imo would suit lxc/lxd perfectly.

          1. 2

            I’ve hacked something similar together over here: https://github.com/librariesio/pydeps

            1. 2

              You’re right - I solved this issue by using docker
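
              A rough sketch of what a throwaway-container probe could look like, not necessarily what anyone in this thread runs. It only builds the `docker run` command line; the helper name `build_probe_command` and the `python:3-slim` image are assumptions picked for illustration.

              ```python
              import shlex


              def build_probe_command(package, version, image="python:3-slim"):
                  """Build a `docker run` argv that installs one release and lists what got installed."""
                  # Quote the requirement so odd package names cannot break out of the shell script.
                  requirement = shlex.quote(f"{package}=={version}")
                  script = f"pip install --quiet {requirement} && pip freeze"
                  # --rm discards the container afterwards, so each probe starts from a clean image.
                  return ["docker", "run", "--rm", image, "sh", "-c", script]
              ```

              Passing the result to `subprocess.run(..., capture_output=True, text=True)` would give the `pip freeze` listing on stdout. The container still has network access here, so this is isolation from the host rather than a full sandbox.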

              1. 2

                Yup. The place I work at (shameless plug: https://fossa.io) does this using ephemeral Docker containers.

                We scan projects to check if they’re compliant with the licenses of their open source libraries. To do this, we need to compute the dependency graph of a project. For most build systems (the exceptions are usually NPM and Golang tools), this means running a full build to execute any arbitrary build scripts.

                If you only do static analysis of package manifests, you tend to both underreport and overreport: you’ll miss packages brought in by build scripts, and you’ll have extra packages (or extra versions of packages) that are included in the manifest but might be unused, optimised out by the build system, or brought in by version constraint solver weirdness.
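
                To make the two failure modes concrete, here is a minimal sketch that compares what a manifest declares with what a real build ends up using. The function name and the result keys are made up for illustration.

                ```python
                def compare_dependencies(declared, resolved):
                    """Compare manifest-declared packages with those a real build pulled in."""
                    declared, resolved = set(declared), set(resolved)
                    return {
                        # Present after the build but absent from the manifest: underreported.
                        "missed_by_manifest": sorted(resolved - declared),
                        # Listed in the manifest but not used by the build: overreported.
                        "unused_in_build": sorted(declared - resolved),
                    }
                ```

                For example, `compare_dependencies(["a", "b"], ["b", "c"])` reports `c` as missed by the manifest and `a` as unused in the build.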