1. 8
  1. 5

    This is not wholly unreasonable, but there are some downsides too, and I don’t see it as so clear-cut.

    The “refactor” and “change” phases (or hats) aren’t always completely unrelated. Let’s say you want to add feature A and the current design makes this hard or impossible – a fairly common motivation for refactoring, especially in non-public modules. Okidoki, so let’s refactor things to accommodate feature A! The problem with having a completely clean separation between the “refactor” and “implement feature A” stages is that you don’t necessarily know if your refactor actually makes sense unless you also implement feature A, especially if A is somewhat complex. You may refactor too little, or too much. Only by actually implementing feature A can you truly understand the problem at a code level, and only by implementing it can you truly know if the API you thought of actually works well.

    So you kind of need to do both at the same time, at least some of the time – depending on the complexity of the API, the complexity of feature A, your familiarity with the code and problem, team size, frequency of changes, etc. Sometimes it makes sense to later split out your branch a bit (I’ve done so many times); other times … I don’t know; not so much – the required time and effort would be disproportionate to the (potential) benefits.

    I also think there’s a lot of overhead in this approach in general. His example with different “refactor” and “change” hats and switching between them seems like a lot of kerfuffle, and I don’t really see the benefit for simple things like “rename a badly named variable” or “add a guard clause”, especially if it’s in the actual code you’re changing.

    Another thing this article doesn’t cover: make sure you actually have a reason to refactor things. “I don’t like the way this is written” is a fine reason for your hobby project, but not for professionally developed production software. At least half of the refactors I’ve seen are basically pointless rewrites of working code in a different way because “you ought to do X, Y, Z”, without any actual objective reasons. These kinds of refactors are not just a waste of time; they can actually make the code worse by accidentally introducing regressions, increasing churn for the rest of the team (“wait, what happened here? How does this work now?”), etc.

    1. 2

      The problem with having a completely clean separation between the “refactor” and “implement feature A” stages is that you don’t necessarily know if your refactor actually makes sense unless you also implement feature A, especially if A is somewhat complex.

      I’ve also had this experience. In some cases it’s really best to just do the refactor and implementation in one go, but I’ve found that if it’s at all possible, it’s often better to do it incrementally with interactive rebase, where you go back and forth between the clean refactoring branch (or commit) and the feature implementation branch. This certainly eases the code review process and if you ever find a bug, it can be very helpful for a git bisect as well, as the two changes are more self-contained.
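
      To make that concrete, here’s a rough sketch of that reordering in a throwaway repo (all branch names, file names, and commit messages are invented, and the `GIT_SEQUENCE_EDITOR` one-liner just scripts the line move you’d otherwise do by hand in the `git rebase -i` editor; assumes a reasonably recent git):

      ```shell
      set -eu
      tmp=$(mktemp -d) && cd "$tmp"
      git init -q -b main repo && cd repo
      git config user.email you@example.com && git config user.name you

      echo base > base.txt && git add . && git commit -qm "base"
      git switch -qc feature-a

      # Refactor and feature work done together, but committed separately:
      echo r1 > helper.txt  && git add . && git commit -qm "refactor: extract helper"
      echo f1 > feature.txt && git add . && git commit -qm "feat: add feature A"
      echo r2 > rename.txt  && git add . && git commit -qm "refactor: rename variable"

      # Move the feature commit (todo line 2) below the two refactor commits,
      # the same edit you'd make by hand in the rebase todo list:
      reorder='f() { awk '\''NR==2 {h=$0; next} {print} END {print h}'\'' "$1" >"$1.t" && mv "$1.t" "$1"; }; f'
      GIT_SEQUENCE_EDITOR="$reorder" git rebase -i main >/dev/null 2>&1

      git log --format=%s main..feature-a  # newest first: the feature commit now sits on top of both refactors
      ```

      After that you can point a separate branch at the last refactor commit and put the two halves up for review independently.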

      At least half of the refactors I’ve seen are basically pointless rewriting of working code but in a different way because “you ought to do X, Y, Z” without any actual objective reasons.

      Ugh, don’t get me started…

      1. 1

        Yeah, I’ve done the “write a massive branch with all sorts of things and then split that up into more bite-size branches/patches” thing many times.

        Often that’s worth it, even if just to get a clearer idea of the changes for yourself. Sometimes, not so much. It’s a very different workflow than proposed in the article, where you’re supposed to stop, put on the “refactor hat”, make the refactor, commit that, go back to your “change hat”, etc. If that works for you: great. But I would rarely do it like that.

      2. 1

        Conversely… I have been doing battle with a C module that when I started was 4k lines long, massive cyclomatic complexity, huge functions, massive amount of global variables, massive fanout, no encapsulation, no unit tests, no integration tests, no asserts, poorly commented, written by 11 people… all with an attitude “I don’t have time or a good reason to clean this up now”.

        That may have been survivable if it was a leaf node… but it sits on top of three different versions of a vast black box library that drives two different hardware devices.

        Unsurprisingly I tend to get a bit grumpy.

        1. 2

          Assuming that’s referring to the last paragraph, it all depends on the code and motivations for the refactor of course. In general I think refactors should have a clear goal in mind:

          • “It will be hard to add feature X without it”
          • “This code is a frequent source of bugs”
          • “This code is hard to understand, and this is a problem”
          • “This code is hard to modify, and we need to modify it frequently”

          Some subjectivity is involved of course, especially with the third and fourth items. But it’s still a fairly straightforward calculation; something like: value = (time_saved - time_on_refactor) * (1 - risk).
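
          As a toy illustration of that back-of-the-envelope calculation (every number here is invented, and the regression risk is treated as a simple percentage discount on the net saving):

          ```shell
          time_saved=40        # hours/year you expect the refactor to save
          time_on_refactor=16  # hours spent doing and reviewing the refactor
          risk_pct=20          # rough % chance it introduces a regression

          # Net hours gained, discounted by the regression risk:
          value=$(( (time_saved - time_on_refactor) * (100 - risk_pct) / 100 ))
          echo "net value: ${value} hours"
          ```

          If that number comes out near zero (or negative), the refactor probably isn’t worth doing.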

          I’ve seen people refactor code that was admittedly less-than-great, but had nonetheless been running in production for years without major problems, was feature-complete, and didn’t cost the team any time to maintain. Someone needs to fix a minor bug, add a minor feature, or just has to use the module, and they will spend hours, days, or even weeks refactoring the damn thing for basically zero benefit. But it does carry a risk of regressions, other people who were used to the code will no longer understand the new version, etc. In some of the most extreme cases I’ve seen people refactor code that was actually perfectly fine, but just didn’t fit some notion of how things “ought to be” or “didn’t look nice” or whatnot.

          If you need to modify it weekly then it’s a different story; this code is active and “alive” and it pays to invest in it.

      3. 1

        If you work with something like Mercurial evolution, you can reorder your branch to be…

        • Commits that only add observability. (eg. logging)
        • Commits that only extend unit tests.
        • Pure micro refactorings.
        • Changes in behaviour.

        Sometimes adding a refactoring makes you nervous about your test coverage, so you extend it… but unless you add that coverage before all your refactorings, it may find breakage in one of your earlier refactorings.

        By placing the test coverage extension before all the refactorings (and after observability changes) you can both…

        • Bisect to find the refactoring that broke things.
        • Diff the output logs to see exactly where and how it broke.
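
        A sketch of how that plays out (using git rather than Mercurial, same idea; the script, commits, and bug are all invented): a toy program gains logging, then tests, then a “pure” refactoring that secretly changes behaviour, and `git bisect run` pinpoints it:

        ```shell
        set -eu
        tmp=$(mktemp -d) && cd "$tmp"
        git init -q -b main repo && cd repo
        git config user.email you@example.com && git config user.name you

        # 1. Observability only: the program logs what it computes.
        printf '#!/bin/sh\necho "frobnicate: input=$1 output=$(( $1 * 2 ))"\n' > run.sh
        chmod +x run.sh
        git add . && git commit -qm "observability: log frobnicate output"

        # 2. Test coverage, added before any refactoring.
        printf '#!/bin/sh\ntest "$(./run.sh 21)" = "frobnicate: input=21 output=42"\n' > test.sh
        chmod +x test.sh
        git add . && git commit -qm "tests: cover frobnicate"

        # 3. A "pure" refactoring that accidentally turns * 2 into + 2.
        sed 's/\* 2/+ 2/' run.sh > run.tmp && mv run.tmp run.sh && chmod +x run.sh
        git commit -qam "refactor: simplify frobnicate"

        # 4. Follow-up work on top.
        echo '# done' >> run.sh
        git commit -qam "change: follow-up"

        # Bisect between the tests commit (known good) and HEAD (known bad):
        git bisect start HEAD HEAD~2 >/dev/null 2>&1
        git bisect run ./test.sh >/dev/null
        first_bad=$(git rev-parse refs/bisect/bad)
        git bisect reset >/dev/null 2>&1
        git show -s --format=%s "$first_bad"   # names the broken refactoring
        ```

        And because the logging commit sits below everything else, you can check out the last good and first bad commits, run the program at each, and diff the logged output to see exactly where it diverged.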