1. 13

  2. 7

    Instead of doing everything in a GitHub Action separately, I suggest pre-commit: https://pre-commit.com/ It’s better because:

    • The same tasks can be run locally.
    • The tools are pinned, formatting, linting rules will be the same. When you ran it locally, it will pass on CI.
    • The whole team can run the same tests, no need to wait problems in CI, it’s too late.
    • Checks can run in pre-commit Git hooks, so no mistakes in review/CI etc.
    1. 2

      A better tool for python specifically is tox which is like make/rake specifically for python builds. It’s designed to be the glue between jenkins/etc and your tests. It knows all about virtualenvs and so on.

      I found pre-commit much more invasive in the past - you’re typically deciding everyone’s local development workflow centrally. I also don’t see why there is any connection between git commits/pushes and having a clean build. eg I’m making progress on a bug and want to save my game and git commit fails with “nope! variable name too short!”

      1. 1

        What you described sounds annoying, but on the other hand, I’m yet to personally work with someone that actually does the “commit often and in small chunks” thing IRL. I might do it sometimes but it’s pretty rare.

        It’s also something easy to work around, I think, with pre-commit.

        A much bigger problem I’ve found, specially when working with lots of inexperienced people, is that everyone will develop their own, ad-hoc, often horrible, local workflow, and a lot of time gets spend on code review checking for things that all those tools could catch locally, automatically. I found that that trumps the inconvenience of pre-commit.

        If your working with seasoned devs, though, I can see how your priorities could be different.

        1. 1

          A much bigger problem I’ve found, specially when working with lots of inexperienced people, is that everyone will develop their own, ad-hoc, often horrible, local workflow, and a lot of time gets spend on code review checking for things that all those tools could catch locally, automatically. I found that that trumps the inconvenience of pre-commit.

          That isn’t a usecase for pre-commit per se, that is a usecase for having any kind of branch CI. There are numerous tools that will allow you to run your build on a branch but pre-commit is about tying build to unrelated git activities like commit/push/rebase/etc

          1. 1

            I partially agree. Current teams I work with have a decent CI, so code review time is better spent, but I’m still seeing loads of commits fixing stuff caught by the CI. Which means they are mostly not running things locally. Pre-commit is a convenient interface to give to them to run things locally.

            1. 2

              I read that more as pre-commit is convenient for you because you’ve forced them to run things locally. Which I don’t totally disagree with, but a couple issues I do have are:

              1. Hooking pre-commit seems like the worst time to force the extra step into the process… either you should be doing it as you work, or you should be doing it at the end in handling a PR. Forcing it piecemeal at random times when you happen to commit just seems like the worst option as the work isn’t necessarily fresh in your mind, and you’re in the middle of trying to accomplish something else, but you get pulled into handling the failures immediately.
              2. This approach also assumes you have invested in making the pre-commit environment 100% reliable and consistent with the CI workers… which is the ideal we would hope for, but I don’t always see in practice. I’ve frequently seen pre-commits that rely on something like node/npm blowing up frequently even when you’re not touching (or even know anything about) frontend, so people just disable the pre-commit and push to CI anyway to see if it works there, because who really wants to debug the npm failure of the day if it might be ok on the worker anyway?
              1. 2

                I can see how those can be valid concerns in general.

                They don’t apply to my case, because, as I mentioned, no one I work with really does the “commit often and in small chunks” thing, our projects are all only python, no frontend, and the set of tools we run on CI is not that big.

                I would say that pushing to see if it works on CI is kinda of a antipattern, though, and something that should be fixed somehow. Maybe not with pre-commit, but people shouldn’t feel forced to push to remote and wait a potentially long time just to see if their code work.

                By the way, on making local validation consistent with CI, this is a big beef I have with most CI systems, it’s basically impossible to reuse the CI setup to do validation locally.

                1. 2

                  as I mentioned, no one I work with really does the “commit often and in small chunks” thing

                  I often see the same, but that seems to make it worse in my mind. If you’re frequently committing, then at least anything that fails a precommit should be fresh in your mind and likely relevant to the reason you were committing. When users pile up layers of loosely related changes, possibly over days, they will need to context switch back to something they might have done days ago when some unrelated task calls for a commit.

                  I totally agree that not being able to align local and CI build envs is an issue, and that relying on pushing to CI is an antipattern. But pragmatically those are antipatterns that frequently exist in the wild, and that pre-commit hooks need to account for… and if and when you have resolved those issues, you’ve solved much of what people call out pre-commit as a fix to, so I don’t see a whole lot of added value in it.

                  1. 2

                    You know, now that you expanded on it, I think we configured pre-commit as a pre-push hook, in the project I used before. So, that would solve some your concerns.

                    The other part is that you can run pre-commit on demand, without a commit or push action, which I remember doing often just to run the lintings and stuff.

                    1. 2

                      Yeah, that sounds more palatable… I think we mostly agree on the desired end state, I think I just tend to prefer using explicit call and/or a more real-time filesystem watcher for my local changes.

                      1. 2

                        Yeah, basically, I don’t care how, just give the same validations as CI, in as easy an invocation as possible.

        2. 1

          I share the same feeling regarding pre-commit, everytime I install the hook, I uninstall it in rage after it takes whole seconds everytime I want to do a temp save point or try to amend a commit I know is imperfect, only to have the commit rejected. I should just acknowledge this tool is not compatible with my way of working. To my workflow, pre-commit is like an annoying peer eavesdropping me and poking me everytime I make a typo that I have already noticed myself when the wrong char appearid on screen.

        3. 1

          I’m pretty interested in this approach as well. I think the answer is that most projects should run these sorts of checks both in a hosted, shared, environment, and locally on developer machines regularly. Running all of these locally is absolutely possible, and should be in a team’s pre-commit hook. I tend to just alias black / isort / bandit together and run them regularly while developing if it’s a personal project.

          How are you executing the pre-commit scripts in your CI environment post-push?

          1. 3

            We just setup everything needed for all the pre-commit hooks to run and just invoke their GitHub Action. See an example here: https://github.com/onekey-sec/unblob/blob/main/.github/workflows/main.yml#L32-L33 Pretty simple!

            1. 2

              Side note: Just discovered vulture from your pre-commit, looks super interesting.

              This is a neat approach, and yeah, totally agree with you that if your team is all-in on pre-commit this makes total sense. It gives you one place to configure what’s running (as well as the options for those steps). I need to have a deeper look into using it for my personal projects.

              In the past I’ve seen teams go both ways - especially with things like unittests in pre-commit. In your example you have your checks in the hook but not the tests. That’s probably what I’d end up with too.

              There are those occasional times that you want to be able to commit / push something that’s work in progress without executing all of those checks (--no-verify to the rescue), but really that should be few and far between. I’m not actually sure what the arguments against using a pre-commit hook in a professional setting are these days.

              1. 7

                I’m not actually sure what the arguments against using a pre-commit hook in a professional setting are these days.

                Naaa, hooks are just a wrong solution to the right problem. What you want is to ensure that the canonical version of code (tip of the main branch) possesses certain properties at any time. Where properties are pretty general: unit tests are passing, code is formatted, no vulnerable dependencies are used, licensees are in order, etc. The correct way to check for that is when you update the tip of the main branch on the server which keeps the canonical version of the code. So, to incorporate changes from a feature branch (or just a feature commit):

                • Server which holds the canonical version of code creates a new “merge commit”
                • Server checks that this new commit possesses all the required properties
                • Server atomically updates the tip of the master branch to point to this already checked commit.

                pre-commit is a “client-side validation” version of this workflow — advisory validation which doesn’t enforce anything, and runs on the outdated version of the code (main is probably ahead of the base of the feature branch by the time you finish).

                Practically, pre commits very much get in a way of people who like to do small changes, and who strive to maintain clean git history, because the way you do that is by rewriting many drafts, and pre-commit creates friction proportional to the number of the wip-commits, rather than finished commits.

                Ultimately, you want to enforce properties somewhere between typing things in the editor and getting code in the canonical tree. Moving enforcement point to the left increases friction, moving it to the right increases correctness.

                1. 3

                  Keeping every single commit clean from trivial mistakes is very useful form automated git bisect debugging. A seriously underrated way of debugging regresions.

                  1. 1

                    It’s possible to either:

                    • bisect only through merge commits
                    • rebase and require checks to pass on every commit, and not only for the last one.
                  2. 1

                    I would agree with you in a unitemporal setting. But I think that integration is bitemporal. In particular, whether dependencies are vulnerable is a bitemporal property. This doesn’t negate your point, but it suggests that your “right problem” is a bit too broad, and it’s actually two problems that we’re trying to solve:

                    1. whether the code integrates correctly for the developer’s environment at time of authorship
                    2. whether the code integrates correctly for the user’s environment at the time of installation
              2. 2

                I personally write all checks as tests using the standard testing framework of a particular language. So it’s, eg, cargo test locally, cargo test on CI, and, if you enjoy pre-commit hooks, cargo test In the hook.

                If some tests are particularly slow, I h skip them unless RUN_SLOW_TESTS env var is set (which is set in CI).

            2. 2

              Some similar examples: