1. 32
  1. 32

    When an error in your code base can take down millions of users who depend upon it for vital work you should

    1. Have good CI
    2. Have extensive tests
    3. Make small changes at a time
    4. Have at least one set of extra eyes looking at your changes
    1. 15
      1. Make use of language features that push you towards correctness, for example static typing.
      1. 8

        I find it shocking how many people love “dynamic languages”

        1. 7

          I don’t. There’s a lot of neat tricks you can do at runtime in these systems that would require 10x more work to do at build time, because our build tools are awful and far too difficult to work with. Problem is that we only have the build-time understanding of things while we’re actually programming.

          Don’t get me wrong, I disagree with taking this side of the trade-off and I don’t think it’s worth it. But I also realise this is basically a value judgement. I have a lot of experience and would expect people to give my opinions weight, but I can’t prove it, and other rational people who are definitely no dumber than me feel the opposite, and I have to give their opinions weight too.

          If our tooling was better (including the languages themselves), a lot of the frustrations that lead people to build wacky stuff that only really works in loose languages would go away.

          1. 7

            I don’t, because I used to be one of those people. Strong type systems are great if the type system can express the properties that I want to enforce. They’re an impediment otherwise. Most of the popular statically typed languages only let me express fairly trivial properties. To give a simple example: how many mainstream languages let me express, in the type system, the idea that I give a function a pointer to an object and it may not mutate any object that it reaches at an arbitrary depth of indirection from that pointer, but it can mutate other objects?

            Static dispatch also often makes some optimisations and even features difficult. For example, in Cocoa there is an idiom called Key-Value Coding, which provides a uniform way of accessing properties of object trees, independent of how they are stored. The generic code in NSObject can use reflection to allow these to read and write instance variables or call methods. More interestingly, this is coupled with a pattern called Key-Value Observing, where you can register for notifications of changes before and after they take place on a given object. NSObject can implement this by method swizzling, which is possible only because of dynamic dispatch.

            If your language has a rich structural and algebraic type system then you can do a lot of these things and still get the benefits of a static type checking.

            1. 2

              Regarding your example, honestly I am not 100% sure that I grasp what you are saying.

              In something like C++ you can define a constant object and then explicitly define mutating parts of it. But I don’t think that quite covers it.

              I have enjoyed some use of Haskell a few years back and was able to grasp at least some of it. But it gets complicated very fast.

              But usually I am using languages such as c# and typescript. The former is getting a lot of nice features and the latter has managed to model a lot of JavaScript behaviour.

              But I have no problem admitting that type systems are restrictive in their expressibility. But usually I can work within it without too many issues. I would love to see the features of Haskell and idris, and others become widely available - but the current languages don’t seem interested in that wider adoption.

              1. 3

                Regarding your example, honestly I am not 100% sure that I grasp what you are saying.

                In something like C++ you can define a constant object and then explicitly define mutating parts of it. But I don’t think that quite covers it.

                I don’t want an immutable object, I want an immutable view of an object graph. In C++ (ignoring the fact that you can cast it away) a const pointer or reference to an object can give you an immutable view of a single object, but if I give you a const std::vector<Foo*>&, then you are protected from modifying the elements by the fact that the object provides const overloads of operator[] and friends that return const references, but the programmer of std::vector had to do that. If I create a struct Foo { Bar *b ; ... } and pass you a const Foo* then you can mutate the Bar that you can reach via the b field. I don’t have anything in the type system that lets me exclude interior mutability.

                This is something that languages like Pony and Verona support via viewpoint adaptation: if you have a capability that does not allow mutation then any capability that you load via it will also lack mutation ability.

                But usually I am using languages such as c# and typescript. The former is getting a lot of nice features and the latter has managed to model a lot of JavaScript behaviour.

                Typescript is a dynamic language, with some optional progressive typing, but it tries really hard to pretend to be a statically typed language with type inference and an algebraic and structural type system. If more static languages were like that then I think there would be far fewer fans of dynamic languages. For what it’s worth, we’re aiming to make the programmer experience for Verona very close to TypeScript (though with AoT compilation and with a static type system that does enough of the nice things that TypeScript does that it feels like a dynamically typed language).

                1. 1

                  I really like the sounds of Verona.

              2. 1

                Strong type systems are great if the type system can express the properties that I want to enforce. They’re an impediment otherwise.

                It’s not all-or-nothing. Type systems prevent certain classes of errors. Tests can help manage other classes of errors. There’s no magic bullet that catches all errors. That doesn’t mean we shouldn’t use these easily-accessible, industry-proven techniques.

                Now, static typing itself has many other benefits than just correctness–documentation, tooling, runtime efficiency, enforcing clear contracts between modules being just a few. And yes, they do actually reduce bugs. This is proven.

              3. 4

                We have somewhat believable evidence that CI, testing, small increments, and review helps with defect reduction (sure, that’s not the same thing as defect consequence reduction, but maybe a good enough proxy?)

                I have yet to see believable evidence that static languages do the same. Real evidence, not just “I feel my defects go down” – because I feel that too, but I know I’m a bad judge of such things.

                1. 1

                  There are a few articles to this effect about migrations from JavaScript to TypeScript. If memory serves they’re tracking the number of runtime errors in production, or bugs discovered, or something else tangible.

                  1. 1

                    That sounds like the sort of setup that’d be plagued by confounders, and perhaps in particular selection bias. That said, I’d be happy to follow any more explicit references you have to that type of article. It used to be an issue close to my heart!

                    1. 1

                      I remember this one popping up on Reddit once or twice.

                      AirBNB claimed that 38% of their postmortem-analysed bugs would have been avoidable with TypeScript/static typing.

                  2. 1

                    Shrug, so don’t use them. They’re not for everyone or every use case. Nobody’s got a gun to your head. I find it baffling how many people like liquorice.

                    1. 1

                      Don’t worry, I don’t. I can still dislike the thing.

                2. 6

                  And if you have millions of users, you also have millions of user’s data. Allowing unilateral code changes isn’t being a good steward of that data, either from a reliability or security perspective.

                3. 16

                  I believe in mandatory reviews, but I usually communicate what I expect from my reviewer when I request a review. Likewise as a reviewer I ask what’s expected of me when I don’t already know. I don’t think mandatory reviews create a culture of carelessly tossing reviews over the wall, and if you stop doing mandatory reviews I don’t think people will magically try harder.

                  One policy doesn’t make or break a culture of developing quality code.

                  1. 12

                    I worry that a standard process gets in the way of building a flexible culture for collaboration.

                    This gets to the heart of my problems with the piece, I think.

                    Lack of process does not scale: in an ideal world everyone would be disciplined in their approach but as the size of an organisation increases you need to agree on what “disciplined” looks like.

                    Even for small projects, I have on days and off days and I appreciate technology that can prompt me towards the better approach. The way I often phrase this in discussions about tooling is: it should be harder not to use it.

                    That comes with a corollary: you should still be able to not use it! No process is sacred or perfect for every situation and it’s important to be flexible. If you’re bypassing something regularly, maybe it’s time to think about why it’s not working. A good process is one that makes your life easier.

                    Maybe this is what the author is trying to get to, and I’m missing the point. I’m certainly not arguing that enforcing PR review is the only way to do things.

                    But I’m wary of arguing that something is all bad just because we might want to do something different sometimes.

                    1. 3

                      That comes with a corollary: you should still be able to not use it!

                      This is fine in many situations. The problems crop up when developers do not fully understand their commitments. These should always be explicit, but a significant number of organizations work with significantly complicated customers and industries that understanding the domain becomes as a big a challenge as the project. You also have the human elements of being hired into a role that’s different from the one presented, organization changes, etc.

                      The rules and process in an effective organization, where “effective” means meeting customer requirements, needs to have some level of enforcement. If it was purely size then the option to skip it would be more acceptable, but for many situations it isn’t. I’m ignoring a large set of organizations that operate with risk by unknowingly violating regulations, or have just hit a point where their legal department will finally force them to snap to a process.

                      Quality of life is an important consideration for developers and operators. In the current tech market, this group tends to be well positioned to dictate a lot about what a good process means from their perspective. When it comes to a drive towards process is optional, especially as short-hand for developers will opt out of the processes that they feel like, then it starts to challenge our level of professionalism as an industry.

                      1. 1

                        This is very true! Some process should not be skipped - and we’re back to expecting people to use good discipline and judgement, which doesn’t always work.

                        Perhaps in such cases that should read: “you should be able to skip it if you can prove you have a damn good reason for doing so”?

                        1. 4

                          Treating the skippage of very important steps as an incident and holding a plain post-mortem is a good start. So less about the skipper proving themselves, more about exploring the circumstances that lead to skippage.

                          1. 2

                            Perhaps in such cases that should read: “you should be able to skip it if you can prove you have a damn good reason for doing so”?

                            If you act with good intent and break a regulatory requirement, you have still broken a regulatory requirement. Depending on circumstances this is still a serious issue and not infrequently investigation finds that it was avoidable. It is much better to pause and collaborate with others to overcome the impasse even when you are uncomfortable because the consequences are not always immediate, or direct.

                        2. 2

                          I agree. Processes like mandatory code review is important in so far as they document your beliefs and knowledge about what was best at the time the decision was made. Beliefs and knowledge change with evidence and circumstances, and so will your processes.

                          The option to duck out of It – if the situation so demands – is also important to me. I think the only road to long term success is to hire smart people and give them freedom under responsibility.

                          Of course, you could argue that doesn’t scale, and I think that’s not a problem with the principle, but simply a consequence of the fact that human creative work does not scale, period. You can’t construct a well-functioning 500 person product development organisation. In order for it to work at all, you’d have to extinguish many of the things that make creative teamwork powerful.

                          Of course, the alternative might be better: put together the 500 people into small mini-orgaanisations that are as independent as possible, and essentially run a little company inside the company (driving stuff from user need to sunset), but with goals determined by larger coherent strategic initiatives.

                      2. 15

                        Whenever I see people hating on pull requests and code reviews, I question the people and culture of their teams. I’ve been doing the PR+CR thing for many years in more than one company, and have never really thought negatively of PR+CR. If hiring is good, the people are good, and the culture is good, then I think PR+CR is not only fine, but is beneficial to a dev team.

                        When I say “good”, I mean things like: everyone assumes positive intent by default; no quarelling over style; people know that blocking a PR is a heavy hammer that should be used sparingly; politeness; respect; humility. Stuff that just comes naturally for a company or department that has great culture.

                        Obviously, if your team’s PRs are a minefield of finger pointing, accusation, ad hominem, style wars, frequent PR blockage, intelligence insulting (even unintentional), and arrogance… nobody would like that. But that’s a people problem, not a process problem.

                        1. 2

                          I remember using tfs and what code reviews were back then was nightmarish compared to PRs. It does help that I have a great team to interact with certainly.

                          Stuff like style is enforced by eslint and similar technologies - reduces the chances of simple formatting being on the diff.

                        2. 10

                          Sometimes there’s acronym requirements that require review on every code change.

                          1. 5

                            +1. If you work at a larger organization, you are likely bound to “acronym requirements” which seek to prevent unilateral access by employees.

                            Another set of eyes is a partial deterrent to insider attacks. That assumes you closed all the obvious holes first, however.

                          2. 9

                            I want to agree with this article, since I’ve seen a lot of cases where mandatory code reviews introduce a bottleneck that isn’t necessary and I honestly don’t have a clearly articulated policy in mind that could be implemented to avoid those cases, but this bit fully made clear my problems with the approach detailed here:

                            Should no-one be responsible? Yes! I am responsible for my code change. I don’t dump that responsibility on you as a reviewer. Taking full responsibility includes assessing my own confidence in the change and enlisting the amount of help from others as necessary.

                            Any of us who have been involved in infrastructure or ops or whatever you want to call it understand well that locating responsibility in the individual, instead of in the system, is a recipe for failure. Any code change that makes it into production should become the collective responsibility of the team in charge of that code once it does, and allocating responsibility in the way suggested here is a great way to incentivize a culture of blame in contrast to a culture of safety and self-reflection. In the absence of a win-win solution to the issues described here, I’ll happily accept code reviews as a bottleneck as an alternative to what would likely happen otherwise.

                            1. 4

                              No reviews makes sense in a situation where everyone is equally competent and in good faith working towards a common well-defined goal with no mistakes.

                              In all other scenarios, I think an extra set of eyes before code goes into the codebase is a nice thing to have, and if people rely on it, a necessary thing to have.

                              Code tests can help alleviate the stress on a reviewer, but there’s a lot of naughty things you can do to a codebase without failing a test.

                              1. 2

                                One of the more interesting ideas I’ve seen is “optimistic merging” (I may be paraphrasing the name).

                                While you’re developing, merge to master. Once it’s in master, someone comes along and reviews it. If it needs small changes, they can just make them themselves (merging the changes, and asking you to review them), if it needs large changes, they can ask you about it. If it ends up needing reverting, that’s ok, version control knows how to revert commits.

                                I haven’t seen it implemented in any major projects, but it seems like a pretty reasonable model to me. Higher trust than the typical review than merge model, but that trust isn’t unwarranted, and not that high risk since code is still eventually reviewed.

                                1. 5

                                  I dislike this for a variety of reasons:

                                  • Once it’s merged, there’s no process to ensure followup, whereas while it’s still in a PR the submitter has an incentive to keep things moving (and there’s a way of even knowing that something still needs review!)
                                  • Reverts muck with version history. They’re easy to do, sure, but they make code archaeology more difficult.
                                  • If someone is sloppy, it should be them making the small fixes, not the reviewer; otherwise you can easily end up with resentment.

                                  …and that’s leaving aside the question of whether it’s even safe in the first place.

                                  1. 4

                                    That said, I also don’t like mandatory reviews, in the strictest implementation. I’ll frequently leave some comments about typos and other simple mistakes and approve, and then my coworker comes along and makes their fixes and merges without requiring a re-approval. I trust them to know what’s worth getting a re-review on—and, in fact, a number of times I have been asked to come back and take a look over the newer changes.

                                    (Edit: Github has a feature where you can require not just an approving review to merge, but re-approval if the branch changes. That’s the thing I don’t like—if someone has committer access, usually you should trust them to handle this kind of situation appropriately without a strict safeguard.)

                                    1. 1

                                      Re: 1, so create a process? For example, have some sort of review tracking system, and a cron job that cuts to a release candidate branch every N days up to the latest reviewed commit (including reverts as reviews). I think this largely handles the safe aspect too.

                                      Even in companies with up front reviews, it’s pretty common IME to have that sort of release process (e.g. for tests passing).

                                      Re: 2, reverts should be rare (how often do you create a PR and not merge it?), but also, in the cases they do happen they’re probably interesting code achaeology themselves. Like an implicit “I tried this and it didn’t work comment”.

                                      Re: 3, Definitely a potential cultural issue, but I don’t think it would be bound to pop up. The reviewer should definitely always have the option of saying “fix this” instead of fixing it themselves.

                                      And yes, your “approve with comments” workflow you describe below is one I use too, that’s sort of my baseline review workflow.

                                      1. 2

                                        Ah, so you’re saying fix-forward rather than revert when there are larger changes requested? I dunno, that goes beyond “optimistic merge” in my book. :-) Because then it turns into a drop-other-work rather than a come-back-to-it.

                                    2. 5

                                      LLVM has a notion of post-commit review. In general, nontrivial things should have pre-commit review, but once you’ve been contributing for a while you’re ‘allowed’ to use post-commit review. This is most useful if you’re working on a part of the system few other people are competent to touch: if you wait until they have time to review your code, it may be blocked for months and it’s better to have the code available for wider testing before then.

                                      This is coupled with a culture of ‘no-blame’ reverts though. If something is committed for post-commit review, anyone in the community is allowed to revert it if it looks broken, and to move it back to pre-commit review. I’ve had patches reverted like this, particularly when they’ve ended up breaking a test on some obscure platform. There’s an expectation that the person who reverted it will engage with the original author.

                                      This is a lot more feasible on a project that has a good CI system and a culture that new features must come with tests because then the chances of a change like this actually breaking things (rather than not implementing things in the best way) is much lower.