1. 13
    1. 3

      I can’t reproduce this. I’m running the exact same commands locally, and it doesn’t appear to run the code in collections.py. Is this Windows-specific behaviour or something?

      In general I don’t find the idea that generating docs involves running arbitrary code surprising, build systems often involve running arbitrary code, and generating docs often involves running build systems. It being python, I wouldn’t even be that surprised if the whole thing was implemented via reflection. I was wondering if I could get the same behaviour with something like python3 -m http.server though, because that would be approaching surprising behaviour.

      1. 7

        In general I don’t find the idea that generating docs involves running arbitrary code surprising,

        I misunderstood this originally. The problem is not that it runs arbitrary code to generate the docs (if you’re importing a package, you’re letting the author of that package [and, by extension, the docs for that package] run arbitrary code anyway, because Python does not have a capability model). The problem is that the docs tool reads files matching a specific name in the current directory and executes them. If someone manages a drive-by download exploit on your browser (I think Safari is still the only mainstream browser that requires you to confirm per site that you want to allow it to download files) then they can drop a file like this in your downloads directory, and if you run the docs command in your downloads directory then you’re owned. Fortunately, it only checks the current directory and not arbitrary parents (as the git vulnerability a month or two ago did), so dropping a file like this in ~/Downloads doesn’t exploit you if you look at docs in ~/Downloads/SomePythonPackage-1.2.3.4/.

        1. 2

          Yeah, what this really boils down to is “if there’s a foo.py in the working directory and you run python foo.py or anything else that imports foo, you will get the working directory’s foo.py”.

          Which is one of those deep tensions between making a thing discoverable/learnable (working directory being on the import path is huge for that) versus trying to lock it down as much as possible. And is getting into an area where it’s hard to really have the language stop you – Python could maybe refuse to run if it detects it’s being invoked in a directory matching common download/home dir names, or change import behavior silently, but now you get confusing inconsistency in how it works, and no amount of “are you sure you want to trust this directory?” popups will actually help the people who are most likely to need the help, since they’ll probably just click through those.

          As some folks have noted, Python has a command-line flag that lets you explicitly decide to minimize the import path, which maybe is the way forward for some tutorials and other beginner/first-time materials. Or maybe it’s a thing that needs to be solved at the operating system level.

          Unrelated: this is also why I and several other people strongly advocate for a code repository layout with a src/ directory top-level, and any modules/packages inside that directory. If the modules are top-level, it’s very easy to trick yourself into thinking your packaging process works because you’re likely running it from the root directory, which implicitly puts all that stuff on the import path. Using a src/ (or similar name) directory means you actually have to get the packaging right in order to successfully install/test.

        2. 1

          The problem is that the docs tool reads files matching a specific name in the current directory and executes them.

          I thought the issue was that running python3 -m foo would run foo.py (or similar, don’t know specifics off the top of my head) – that is, this is nothing to do with the docs tool itself. Am I mistaken?

          1. 1

            Running python -m foo will run whatever foo module is found first, starting with the current working directory. The same is true of running python -m pydoc foo.

            The specific “exploit” shown here is more like

            1. Module foo imports standard library module collections
            2. I manage to get a malicious file named collections.py into your current directory and convince you to run python -m pydoc foo
            3. The import collections inside foo gets resolved to the current directory’s collections.py, so that’s the file that gets imported. If it has any import-time side effects, they execute.

            It’s a bit convoluted to actually pull off, because generally you need to convince someone to run python from their downloads directory or something like that.
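A minimal sketch of the three steps above, for anyone who wants a reproduction that does not depend on what the interpreter happens to import at startup. Everything here is an assumption made for the demo: the file names, the helper name run_module_in_dir, and the use of colorsys instead of collections (colorsys is a small stdlib module that is not normally imported during startup, so the result should be more stable). It runs python -m foo rather than python -m pydoc foo to keep the example small.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical file contents for the demo.
# foo.py plays the role of "module foo imports a standard library module".
FOO_SOURCE = "import colorsys\nprint('foo finished')\n"
# colorsys.py plays the role of the malicious file dropped in the directory.
SHADOW_SOURCE = "print('local colorsys.py ran at import time')\n"


def run_module_in_dir(module="foo"):
    """Run `python -m <module>` from a directory that holds a shadowing file."""
    with tempfile.TemporaryDirectory() as workdir:
        for name, source in (("foo.py", FOO_SOURCE), ("colorsys.py", SHADOW_SOURCE)):
            with open(os.path.join(workdir, name), "w") as handle:
                handle.write(source)
        # Drop variables that change sys.path so the demo shows the defaults.
        env = {k: v for k, v in os.environ.items()
               if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
        result = subprocess.run(
            [sys.executable, "-m", module],
            cwd=workdir, env=env, capture_output=True, text=True,
        )
        return result.stdout


if __name__ == "__main__":
    # If the local file shadows the stdlib module, its print shows up here.
    print(run_module_in_dir())
```

The child process is started with its working directory set to the temporary directory, which is the part an attacker would need to arrange in practice.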

      2. 1

        It works for me on Debian 11 and Ubuntu 22. And it also works for “python3 -m http.server”.

        Which OS do you use? Can you try running it in a Debian container?

        When I do, it also works for me:

        docker run --rm -it debian:11-slim
        apt update -y && apt upgrade -y
        apt install -y python3
        echo 'print("P0wned");exit()' > collections.py
        python3 -m http.server
        P0wned
        
        1. 2

          I tried it on Void Linux and macOS, with python3.11 on both systems. EDIT: Oops, python3.10.8 on the Mac; I had checked the version number in a terminal with an ssh session open (but I did fail to reproduce on the actual Mac).

          Your docker repro works for me, and installs python3.9. Maybe the behaviour has changed in more recent versions of python?

          1. 3

            Continuing weirdness: I cannot reproduce with Debian’s python 3.9.2, but if I install 3.10.6 (using pyenv) I can finally reproduce what no_gravity is seeing. But I cannot reproduce it using Debian’s system python (which he seems able to do on Ubuntu).

            I’m going to stop messing around with this now, but there seem to be other factors at work here.

            # I installed python 3.10.6 using pyenv and made that the local python
            telemachus(digitalocean) wtf$ python3 --version
            Python 3.10.6
            telemachus(digitalocean) wtf$ python3 -m http.server
            P0wned
            Could not import runpy module
            
            # Now I've gone back to the system python3
            telemachus(digitalocean) wtf$ rm .python-version
            telemachus(digitalocean) wtf$ python3 --version
            Python 3.9.2
            telemachus(digitalocean) wtf$ python3 -m http.server
            Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
            ^C
            Keyboard interrupt received, exiting.
            
        2. [Comment removed by author]

      3. 1

        I also cannot reproduce—not on macOS 12.6.1 with python 3.10.8 and not on Debian 11 with python 3.10.5. (My shell is bash on both systems, though I doubt that matters.)

        I wonder what other variables are at play.

        1. 1

          It works for me with Python 3.10.6:

          docker run --rm -it ubuntu:22.04
          apt update -y && apt upgrade -y
          apt install -y python3
          python3 --version
          => Python 3.10.6
          echo 'print("P0wned");exit()' > collections.py
          python3 -m http.server
          => P0wned
          
          1. 2

            I can finally reproduce this, but only with some pythons and in some cases. I don’t understand this at all. In any case, I’m glad to learn about the larger issue.

    2. 3

      Python 3.11 adds an interpreter flag (-P) and an environment variable (PYTHONSAFEPATH) that you can use to prevent this behavior.
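A rough way to check the effect of -P and PYTHONSAFEPATH on your own interpreter. This is a sketch, not a definitive test: the helper name local_file_wins and the choice of colorsys as the shadowed module are assumptions for the demo, and the -P and PYTHONSAFEPATH cases are only attempted on Python 3.11 or newer.

```python
import os
import subprocess
import sys
import tempfile

# The child process imports colorsys and reports whether it got the local copy.
CHILD_CODE = "import colorsys; print(hasattr(colorsys, 'SHADOW'))"


def local_file_wins(interpreter_args=(), extra_env=None):
    """Return True if `import colorsys` picks up a colorsys.py from the cwd."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical shadowing file; the marker attribute identifies it.
        with open(os.path.join(workdir, "colorsys.py"), "w") as handle:
            handle.write("SHADOW = True\n")
        # Start from a clean slate for the variables that affect sys.path.
        env = {k: v for k, v in os.environ.items()
               if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
        env.update(extra_env or {})
        result = subprocess.run(
            [sys.executable, *interpreter_args, "-c", CHILD_CODE],
            cwd=workdir, env=env, capture_output=True, text=True, check=True,
        )
        return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("default:", local_file_wins())
    if sys.version_info >= (3, 11):
        print("-P:", local_file_wins(["-P"]))
        print("PYTHONSAFEPATH=1:",
              local_file_wins(extra_env={"PYTHONSAFEPATH": "1"}))
```

Both settings only stop the interpreter from prepending the unsafe entry to sys.path; a PYTHONPATH that contains a relative entry could still bring the current directory back.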

      1. 6

        Yes, if you write the python command by hand and are aware of the issue, you can probably mitigate it.

        The tricky thing is that the python call might be somewhere in a shellscript you use.

        The issue actually came up when an irc user reported his computer goes bananas when he cds into a certain dir. Turned out he was using a tool that executes some Python on every cd. In that dir, there was a “types.py”. And the shellscript ran some Python that imported “types”.

        1. 2

          Fair enough, though (weirdly) I still cannot reproduce the original example or your example with http.server. It’s not that I disbelieve you, but I don’t understand why I cannot reproduce this.

          1. 1

            How did you install Python?

            1. 2

              I have tried Python installed by Debian (via apt), MacPorts, and pyenv (pyenv on macOS and on Debian). The results are not consistent. That is, sometimes, a pyenv-python ignores collections.py and other times a pyenv-python reads the local file and shows the vulnerability. So far, I have not been able to get a Python installed by Debian or MacPorts to read the local file and show the vulnerability.

      2. 3

        If using an older version of Python (I still run py36-py38), an alternative solution is to add an extra import hook (via sys.meta_path, but an extra FileFinder in sys.path_hooks might also be enough) before the defaults to avoid this behavior.

        I’ve stubbed my toe on this “feature” of Python’s import system enough times to write a custom import hook: https://github.com/ahgamut/cosmopolitan/blob/importer-cosmo/third_party/python/Lib/importlib/_bootstrap.py#L1089

        Why does this happen? When running “import foo”, what python (approximately) does is the following:

        • Python looks through its sys.meta_path entries to see if any of them can handle importing foo (for example, if foo involves a dynamic shared object, you’d need an entry that calls dlopen).
        • How does the entry check if it can handle the import? It does a few checks (is foo compiled within the default libpython.so or is it frozen bytecode?) or sometimes just attempts the import outright with a try-except fallback.
        • The last sys.meta_path entry is usually the one that walks through the files and folders on your filesystem, and this is where the whole PYTHONPATH/PYTHONSAFEPATH gimmick comes into play: Python checks the entries of sys.path (which is influenced by the env vars) one by one for foo.py, and the first entry in sys.path is usually "." i.e. the current directory. This is why when you have a types.py in your current directory and a script does import types, you get the local file instead of the types module from the stdlib.
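To see the lookup described above on a given interpreter, something like the following can help. It is only an inspection sketch: the helper name where is made up for this example, and it relies on importlib.util.find_spec, which reports the spec of an already-imported module from sys.modules rather than searching again.

```python
import importlib.util
import sys


def where(name):
    """Return the origin the import system reports for `name`, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # The finders are consulted in this order for a module not yet imported.
    print("finders:", [getattr(f, "__name__", type(f).__name__)
                       for f in sys.meta_path])
    # The first sys.path entry is the one that can shadow the stdlib.
    print("first sys.path entry:", sys.path[:1])
    for name in ("sys", "types", "collections"):
        print(name, "->", where(name))
```

Running this from a directory that contains a types.py or collections.py, and comparing the output with a run from an empty directory, shows whether that particular interpreter resolves the name to the local file.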

        Therefore, to avoid this mistake, we can add an entry at the start of sys.meta_path or sys.path_hooks that checks the “safe” locations first, before punting to the later entries that use the local directory.
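A minimal sketch of the sys.meta_path variant of that idea. This is not the hook linked above; the class name StdlibFirstFinder and the install helper are hypothetical, and the directory list assumes a POSIX-style layout where extension modules live in lib-dynload (a Windows install would presumably need its DLLs directory added). It only helps for imports that happen after install() has run, so it cannot protect modules the interpreter imports during startup.

```python
import importlib.abc
import importlib.machinery
import os
import sys
import sysconfig


class StdlibFirstFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder: search the stdlib directories before sys.path."""

    def __init__(self):
        paths = sysconfig.get_paths()
        dirs = [
            paths["stdlib"],
            paths["platstdlib"],
            # Assumption: compiled stdlib modules live here (POSIX layout).
            os.path.join(paths["platstdlib"], "lib-dynload"),
        ]
        # Drop duplicates while keeping order.
        self.stdlib_dirs = list(dict.fromkeys(dirs))

    def find_spec(self, fullname, path=None, target=None):
        # Only top-level imports can be shadowed by the current directory;
        # submodules are located through their parent package's __path__.
        if path is not None:
            return None
        spec = importlib.machinery.PathFinder.find_spec(fullname, self.stdlib_dirs)
        # Ignore namespace-package matches (no origin) so that stray
        # directories inside the stdlib tree do not hide real packages.
        if spec is None or spec.origin is None:
            return None
        return spec


def install():
    """Put the finder first in sys.meta_path, once."""
    if not any(isinstance(f, StdlibFirstFinder) for f in sys.meta_path):
        sys.meta_path.insert(0, StdlibFirstFinder())
```

Calling install() early (for example from a sitecustomize module) means a later import types or import collections is answered from the stdlib directories, and anything not found there falls through to the default finders unchanged.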

        To avoid the inverse of this mistake (i.e. I want to import my local foo.py but its name clashes with the stdlib), I try import .foo or from . import foo, but usually I just rename the local file :P

      3. 2

        The older -I flag is a bit more comprehensive and was updated to imply -P.

    3. 2

      I’m guessing someone has “.” in their PYTHONPATH.

    4. 2

      This is a feature…