1. 26
  1. 9

    coughs design by contract

    1. 7

      The earliest you can crash is at compile time :p.

      1. 6

        I too am in the “crash early and loudly” camp and use assert() quite liberally [1]. Coupled with that, I wrote code to log segfaults (and other “this is bad” signals) to syslog() for cases where I could not get a core dump. Even though the code is fragile and probably violates POSIX semantics, it works (at least well enough for me on Linux) and it has caught several bugs. In fact, when I first used it, it found a long-standing bug that had been happening for years without my knowing.

        [1] Usually to assert that input parameters to functions are valid. I do NOT assert data from the I/O boundary though. But there are the occasional calls to assert() to ensure what I think is true, is true.
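
        A minimal Python sketch of that split, with hypothetical names (`parse_port`, `connect`) that are not from the comment: data arriving from the I/O boundary is validated and rejected with an ordinary error, while an internal function asserts that its caller passed valid parameters.

        ```python
        def parse_port(raw: str) -> int:
            """I/O boundary: untrusted text, so validate and raise instead of asserting."""
            try:
                port = int(raw)
            except ValueError:
                raise ValueError(f"not a number: {raw!r}") from None
            if not 1 <= port <= 65535:
                raise ValueError(f"port out of range: {port}")
            return port


        def connect(host: str, port: int) -> str:
            """Internal function: a bad argument here is a bug in the caller."""
            assert host, "host must be a non-empty string"
            assert 1 <= port <= 65535, "port must already be validated"
            return f"{host}:{port}"
        ```

        Note that Python drops `assert` statements when run with `-O`, so this only crashes early in non-optimized runs.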

        1. 6

          Mostly agree, but be careful you aren’t setting yourself a trap.

          The resulting crash and stack trace will be extremely obvious and easy to debug and fix. It may become clear that the invariant of bar never being null does not hold.

          At this point, if you change the API to allow null and return an error, you have tons of existing callers to update, each of which may not themselves be prepared to return errors. It can be a lot of work, much of it precarious, to retrofit error handling into existing code. I’d estimate the effort involved is significantly higher than starting with error handling everywhere and removing it when it proves unnecessary.
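
          A rough Python illustration of that retrofit cost. Everything here (`foo_v1`, `foo_v2`, and the callers) is hypothetical and only stands in for the kind of API being discussed: once `bar` may be missing, each caller needs an error path of its own.

          ```python
          from typing import Optional


          def foo_v1(bar: Optional[str]) -> int:
              # Original contract: callers must never pass None.
              assert bar is not None, "bar must not be None"
              return len(bar)


          def foo_v2(bar: Optional[str]) -> Optional[int]:
              # Relaxed contract: None is allowed and reported as "no result".
              if bar is None:
                  return None
              return len(bar)


          def caller_v1(bar: str) -> int:
              # Written against foo_v1: assumes an int always comes back.
              return foo_v1(bar) + 1


          def caller_v2(bar: Optional[str]) -> Optional[int]:
              # Written against foo_v2: needs its own error path, as does its caller.
              n = foo_v2(bar)
              if n is None:
                  return None
              return n + 1
          ```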

          1. 1

            Definitely see your point, but in the alternative of leaving in error handling code instead of asserting, when do you determine when to remove it (i.e. when it “proves unnecessary”)? On the other hand it’s obvious when to add back in error handling code - you add it back in whenever you need it.

            1. 3

              Depends on the age of the code I guess. Then again, if the code is stable, the error handling isn’t really in the way either, since you’re not changing that code anymore.

          2. 5

            Assert early and assert often

            Reading this reminded me of Andrew Doull’s 2008 blog post: Asserting a Code Style. Andrew wrote and maintained the roguelike Unangband until early 2017, itself a derivative of Angband. Andrew is in the fail early camp, while the Angband code (and thus Unangband) was not:

            “There are two schools of thought regards how your code should fail: either it should try to keep going as best it can, or it should hard stop. After many years in IT, I’m firmly in the hard stop school. It makes it far easier to debug, if your program fails at the point at which an exception occurs”

            “[Unangband] is written to the silently corrupt and experience weird errors standard.”

            Andrew then goes on to describe a method for incrementally adding runtime checks to the Unangband code.

            His experience matches mine. I recall working on a data processing pipeline and adding a requirement that, when writing a file, at least one record must be written or it’s a fatal error. After doing so we immediately found a bug early in the pipeline whose consequences didn’t manifest until near the end. We did also have to disable the error in places where we knew it was safe to have an empty file. On balance it was a win: we found exceptions early, when it was cheaper to do so.
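
            One way such a check might look, sketched in Python. The function name `write_records` and the `allow_empty` opt-out are assumptions for illustration, not the actual pipeline code.

            ```python
            from typing import Iterable


            def write_records(path: str, records: Iterable[str], allow_empty: bool = False) -> int:
                """Write one record per line and fail loudly if nothing was written."""
                count = 0
                with open(path, "w", encoding="utf-8") as f:
                    for record in records:
                        f.write(record + "\n")
                        count += 1
                if count == 0 and not allow_empty:
                    # Fatal by default: an empty output usually means an upstream bug.
                    raise RuntimeError(f"no records written to {path}")
                return count
            ```

            Call sites where an empty file is known to be safe would pass `allow_empty=True`, which keeps the exceptions to the rule visible in the code.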

            1. 5

              In most cases I’d agree with him. I had examples like this where I would just throw an exception in PHP if some variable didn’t have the expected value. I figured at that time: if that happens, I don’t know how to handle it so let’s crash. Eventually it did happen once but the clear stack trace in the log made it very easy to locate and fix the issue.

              Crashes basically give clear indications that 1. something needs to be done and 2. how. The opposite is subtle invalid state that spreads through the app and you often don’t notice till it’s too late, and then it’s very hard to find the root of the problem.

              1. 5

                Depends on what “crash” means. In my experience, crashing a server due to an isolated issue in the handler of a specific kind of request isn’t a good idea. This tends to turn low-priority issues into VERY URGENT MUST FIX DURING CHRISTMAS priority issues.

                Just abort anything that depends on your faulty assumptions, make a very loud noise and leave the rest of the system intact.
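
                A small Python sketch of that idea, assuming a hypothetical `dispatch` wrapper and `handle_square` handler: a failed assumption aborts only the request that hit it, gets logged with a full traceback, and the process keeps serving everything else.

                ```python
                import logging
                from typing import Callable, Dict

                log = logging.getLogger("server")


                def handle_square(payload: Dict[str, int]) -> str:
                    n = payload["n"]
                    # A faulty assumption aborts this request only.
                    assert n >= 0, "n must be non-negative"
                    return str(n * n)


                def dispatch(handler: Callable[[Dict[str, int]], str], payload: Dict[str, int]) -> str:
                    """Run one request; a failure aborts that request and nothing else."""
                    try:
                        return "200 " + handler(payload)
                    except Exception:
                        # The loud noise: ERROR-level log entry including the traceback.
                        log.exception("request aborted: payload=%r", payload)
                        return "500 internal error"
                ```

                Nothing computed by the failed handler is reused; the caller only ever sees the error response.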

                1. 7

                  This is why Rust and Erlang are nice: very good isolation between independent threads of execution, so aborts don’t take unrelated items down.

                  1. 5

                    As with most things, it depends. If the alternative is for the program to continue running with invalid data, you might find after xmas that your database was being overwritten with garbage.

                    1. 3

                      That’s why I said “just abort anything that depends on your faulty assumptions”. I didn’t mean you should recover as much as you can.

                  2. 3

                    Was really hoping this was going to point to the Erlang model.

                    1. 2

                      Yeah, but I don’t think it’s a compatible model for most languages

                    2. 2

                      While I understand the sentiment, I wish we would use languages where it’s not possible to have a state that wasn’t explicitly designed by the programmer. If an input to a function can be null, it should be some kind of nullable type and you should deal with the null case in the function. If it can’t be null, make it a non-nullable type and require the caller to send you a valid value.
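
                      A loose approximation in Python, assuming a static type checker such as mypy is run over the code (Python itself does not enforce these annotations at runtime). The function names are made up for illustration.

                      ```python
                      from typing import Optional


                      def greeting(name: str) -> str:
                          # Non-nullable parameter: a type checker rejects callers that pass None.
                          return f"Hello, {name}!"


                      def greeting_or_default(name: Optional[str]) -> str:
                          # Nullable parameter: the None case is dealt with explicitly here.
                          if name is None:
                              return "Hello, stranger!"
                          return greeting(name)
                      ```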

                      1. 2

                        Very few of us can actually wrap our minds around such formal thinking. For better or worse, “Make Computer Do Thing” is still the purpose of our profession, and things that make that harder to attain, like the type beauty in Haskell, don’t help us enough.

                      2. 1

                        “the idea that any total program crash (via segmentation fault, panic, null pointer exception, assertion, etc.) is an indication that a piece of software is poorly written and cannot be trusted.”

                        This is poorly written. Most programs do more than one thing and should not crash out unceremoniously. Very often catching general exceptions, logging out detailed tracebacks and continuing is to be preferred.