1. 12

  2. 7

    I can extend this theory with one variation I saw in real life while working on banking software. The author seems mostly worried about the quality of code, the obvious duplication (increasing maintenance) and loosing control of the feature set/dependencies used by the project.

    The software I mentioned had 60 active developers & around 2 million lines of code. On one day we had a bug report that could be brought down to a type coercion error which resulted in an exception being raised. We spend two days hunting that issue down and finally found the culprit. It was an issue in the error handling code path. Take this pseudo-code example

    func test(a string, b string)
       lots of code & function calls
     catch all(ex):
       return errors.handle_error(ex.code, ex.msg);

    The initial assumption was that the bug resided inside the body of the test function. The reality was the errors.handle_error signature:

    errors.handle_error(string, int)

    the language raised an exception covering the actual cause of the bug. We spend two days as we didn’t see anything in the code that could have caused the coercion error in the body of the code. Do note that this was a ‘dead’ bug report. We only had the error message and the stack trace - no live system to debug.

    The bad part? We took a look at the rest of the code base. It was quickly revealed that this incorrect error handling pattern was copied over 700 times in the code base which meant 700 more hard to diagnose issues hiding the actual business/code issue that caused the exception in the first place. All of those bugs were ‘dormant’ until an exception in one of them triggered the buggy error handler. We took another working day and fixed them all which I think is as important as preventing situations like this before going inside the code. The worst part was actually revealed while performing the fix - one of the packages had the error handling already fixed, the developer just didn’t took the time to fix other occurrences which costed my team 2 days.

    I also saw people copying code that was never executed (literally dead) moving it into their own newly coded solution/fix essentially reanimating the dead code and exposing dormant bugs in it.

    1. 1

      Ouch! Scary. I take it the parameters were backwards?

      This is the sort of thing where it’s definitely incredibly important to fix everywhere when you find it. I can only imagine that other developers must have been ignoring bug reports for a long time because the stack traces weren’t actionable.

      1. 1

        Correct. The parameters were reversed.

        On your second question - no. Software this big (2 millions lines of code) with such few users (around 10-20 BIG financial institutions covering almost the whole market in Poland) doesn’t hit every corner case that often but when they do stuff is on fire.