1. 32

  2. 4

    Wow. My first thought was: “who does this?” Then I realized how easy it is to do by accident: mock up some code with a hard-coded secret, switch to reading it from elsewhere later, and accidentally commit the older bytecode.

    This is really a study in sane defaults. Putting .pyc files alongside source is not a sane default. Why not stuff them into a profile dir controlled by Python?

    1. 2

      > This is really a study in sane defaults. Putting .pyc files alongside source is not a sane default. Why not stuff them into a profile dir controlled by Python?

      Maybe because that’s traditionally where compilers put binaries built from source?

      I’d argue that it’s on the developer to understand the tools they are using (source control) and properly use features in that tool (e.g. .gitignore) to prevent leaking secrets.

      1. 3

        I think it’s hard to argue that it’s both convention and not a problem when C and C++ projects have largely moved on from doing this (and most encourage completely out of tree builds), and that Python 3.x moved .pyc files to a separate __pycache__ folder.
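
        For what it's worth, Python 3.8 and later can already do roughly what's asked upthread: setting PYTHONPYCACHEPREFIX (or passing -X pycache_prefix) makes the interpreter write bytecode under a separate directory tree instead of next to the source. A rough sketch, assuming python3 is 3.8 or newer; settings.py and the temp directories are made up for the demo:

        ```shell
        # Throwaway directories: one for source, one for bytecode.
        src=$(mktemp -d)
        cache=$(mktemp -d)

        # Hypothetical module holding a placeholder value.
        printf 'TOKEN = "placeholder"\n' > "$src/settings.py"

        # Import the module with the bytecode cache redirected.
        # PYTHONDONTWRITEBYTECODE is cleared in case the environment sets it.
        cd "$src"
        PYTHONDONTWRITEBYTECODE= PYTHONPYCACHEPREFIX="$cache" python3 -c 'import settings'

        # The source directory stays clean; the .pyc lands under $cache instead.
        ls "$src"
        find "$cache" -name '*.pyc'
        ```

        It's opt-in rather than a default, so the .gitignore advice still applies.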

        1. 2

          > Maybe because that’s traditionally where compilers put binaries built from source?

          In the case of a compiler, the user is explicitly requesting to generate an executable to run. In the case of Python, the interpreter just spits them out alongside source, and we have to tell users to ignore them. Intent makes all the difference here.

          > I’d argue that it’s on the developer to understand the tools they are using (source control) and properly use features in that tool (e.g. .gitignore) to prevent leaking secrets.

          Agree, though I see Python’s .pyc files as incidental complexity which shouldn’t be foisted on users.

          1. 2

            I think you could just as easily blame git for not having sensible defaults for excluding compiled files by default.

            1. 1

              The problem with this is that git doesn’t (and arguably shouldn’t) know what a compiled file looks like for every single language in existence. It’s up to somebody to tell git that using .gitignore.

              I don’t know whether there is a common way to set up a new python project (like rust’s cargo init or Haskell’s cabal init) which could be modified to (offer to) create a sensible python .gitignore as part of the process.

              Btw, you can have a global .gitignore if you know you never want to accidentally commit a certain file pattern:

              git config --global core.excludesfile '~/.gitignore'
              

              This might allow you to have a “sensible default” on your machine at least.
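
              As a sketch of what such a file might contain for Python, the two patterns below are the usual bytecode ones. To avoid touching anyone's real configuration, this uses git -c to apply core.excludesFile to a single command inside a throwaway repo; the paths and file names are invented for the demo:

              ```shell
              # Throwaway sandbox: a repo plus a stand-in for the global ignore file.
              sandbox=$(mktemp -d)
              mkdir "$sandbox/repo"
              cd "$sandbox/repo"
              git init -q .

              # Hypothetical global ignore file with common Python bytecode patterns.
              printf '%s\n' '__pycache__/' '*.py[cod]' > "$sandbox/global-ignore"

              # Files that an interpreter run might leave behind.
              mkdir __pycache__
              touch settings.py settings.pyc __pycache__/settings.cpython-311.pyc

              # -c sets core.excludesFile for this one command only, standing in for
              # the `git config --global` call; -v shows which pattern matched.
              matches=$(git -c core.excludesFile="$sandbox/global-ignore" \
              	check-ignore -v settings.pyc __pycache__/settings.cpython-311.pyc)
              echo "$matches"
              ```

              git check-ignore is also handy for debugging an existing ignore setup, since it reports the file and pattern responsible for each match.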

              1. 2

                > I don’t know whether there is a common way to set up a new python project

                There’s nothing built in to the language’s packaging toolchain to do this. There are popular third-party options like cookiecutter. GitHub itself also provides stock .gitignore files for many languages, including Python, and supports creating a repository structure from a template.

                1. 1

                  The global .gitignore is what I’m talking about. I’m sure there’s a reason, but why isn’t that set up with “common” (as defined by whoever makes a PR) compiled filename wildcards? Seems like the problem here is git is overly eager to commit everything (which makes sense!), but should it also not be git’s responsibility to make a reasonable effort to protect users from themselves?

                  Isn’t there something about how with great power comes great responsibility?

                  1. 1

                    Git is a tool, and it assumes that the user knows what they’re doing. To me, what you’re suggesting sounds like a request for a hammer which refuses to hit screws. Git offers many ways to stage and commit changes, most of which aren’t overly eager to commit everything. If users are learning to use things like git add -A by default, that’s more of an education problem. Yes, git could try to make it harder to do the wrong thing, but at a cost of removing features that can be useful. A better option could be to encourage new users to use one of the many git GUIs or TUIs, which make it more obvious what’s being committed and how to pick and choose which changes to commit.

                    To expand on this a bit, not excluding any files (except .git) by default is the “least surprising” option. If you ask git to add all files, it adds all files. If you asked it to add all files and it missed out one or two because it thought they looked like build artefacts from some obscure language you’ve never heard of, that would be confusing and frustrating.

                    1. 1

                      I agree with all of that, but I also find it strange that you put the onus of setting up a proper .gitignore file on Python (or any other language’s tooling).

                      I’d argue that since git is not designed to hold build artifacts it is surprising that it does not, by default, attempt to prevent them from being committed. Sure, you can then argue about common vs obscure but that’s a bit of a silly argument since you could just say it’ll only cover languages that are popular enough by some arbitrary metric.

        2. 4

          Good one. The joys of git add . without setting up .gitignore…

          1. 2

            We should also do a better job of educating devs to review all staged changes prior to committing. It’s not enough to git add ., you also need to check it to ensure you agree that everything being committed is absolutely necessary.
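
            A minimal sketch of that review step, run in a throwaway repo with made-up file names: list what is staged, then drop anything that shouldn't be there from the index before committing.

            ```shell
            # Throwaway repo with one real source file and one stray bytecode file.
            repo=$(mktemp -d)
            cd "$repo"
            git init -q .
            printf 'print("hello")\n' > app.py
            touch app.pyc    # stand-in for stray bytecode

            # The risky step: stage everything in the directory.
            git add .

            # Review: list exactly what is staged before committing.
            git diff --cached --name-only

            # Remove the unwanted file from the index while keeping it on disk.
            git rm --cached -q app.pyc

            staged=$(git diff --cached --name-only)
            echo "$staged"
            ```

            git diff --cached without --name-only shows the full staged diff, which is the more thorough check for text files.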

            1. 2

              I’ve disabled git commit’s -a parameter on all my machines. It’s just too dangerous/too easy to screw things up with it. git add $dir is in the same category, IMHO.

              1. 1

                How do you disable the -a parameter?

                1. 2

                  I’ve put this into my .bashrc:

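                  git() {
                  	# Scan every argument passed to git, whatever the subcommand
                  	# (so e.g. git branch -a is refused too).
                  	for arg
                  	do
                  		# Match -a, -a followed by anything (e.g. -am), or a bundle of
                  		# short flags containing an a (e.g. -va). Long options such as
                  		# --amend are let through by the [^-] check.
                  		if [[ $arg == -a* || $arg == -[^-]*a* ]]
                  		then
                  			echo "DO NOT USE -a!"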
                  			beep -l 350 -f 392 -D 100 -n -l 350 -f 392 -D 100 -n -l 350 -f 392 -D 100 -n -l 250 -f 311.1 -D 100 -n -l 25 -f 466.2 -D 100 -n -l 350 -f 392 -D 100 -n -l 250 -f 311.1 -D 100 -n -l 25 -f 466.2 -D 100 -n -l 700 -f 392 -D 100 -n -l 350 -f 587.32 -D 100 -n -l 350 -f 587.32 -D 100 -n -l 350 -f 587.32 -D 100 -n -l 250 -f 622.26 -D 100 -n -l 25 -f 466.2 -D 100 -n -l 350 -f 369.99 -D 100 -n -l 250 -f 311.1 -D 100 -n -l 25 -f 466.2 -D 100 -n -l 700 -f 392 -D 100
                  			return 1
                  		fi
                  	done
                  	command git "$@"
                  }
                  

                  The beep invocation is of course entirely optional ;-)

            2. 3

              In the demo repo linked from this post, I was able to extract the secret using just strings(1). No decompilation required!
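
              To reproduce the idea locally, here is a rough sketch that compiles a made-up secrets.py and greps the resulting bytecode for the value. It assumes python3 is on the PATH; the file name and the key are placeholders, and the grep fallback is only a crude stand-in for machines without strings(1):

              ```shell
              # Throwaway directory with a hypothetical module holding a fake key.
              work=$(mktemp -d)
              cd "$work"
              printf 'API_KEY = "hunter2-not-a-real-key"\n' > secrets.py

              # Compile it; Python 3 writes the bytecode under __pycache__/.
              python3 -m py_compile secrets.py
              pyc=$(find __pycache__ -name '*.pyc')

              # Print runs of printable characters from a binary file.
              dump() {
              	if command -v strings >/dev/null 2>&1; then
              		strings "$1"
              	else
              		LC_ALL=C grep -a -o '[[:print:]]\{4,\}' "$1"
              	fi
              }

              # The string constant is stored as plain bytes, so it shows up as-is.
              found=$(dump "$pyc" | grep 'not-a-real-key')
              echo "$found"
              ```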

              1. 1

                Would probably be a good idea to do exactly the same thing again but with .pyo files. Just in case someone was using the -O flag in a project with a secrets.py file.

                1. 1

                  That’s a great thought! A preliminary search didn’t yield anything interesting, but playing around with other file names/extensions might turn something up: https://github.com/search?q=filename%3Asecrets.pyo