1. 20

  2. 5

    Can you re-use the test suite if your entire software is replaced with an opaque neural network?

    Haha this is great, I’m writing it down.

    1. 2

      Thanks for sharing! Your writing continues to get better, keep it up :-)

      1. 2

        This is something I inflicted upon myself early in my career, and something I routinely observe. You want to refactor some code, say, add a new function parameter. Turns out, there are a dozen tests calling this function, so now a simple refactor also involves fixing all the tests.

        So use a default parameter. Problem solved. Or use the O in sOlid. Problem solved.

        Or even, since most use cases got along happily without that parameter, maybe reach for the I in solId.
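
        A minimal Rust sketch of the idea, with made-up names (frobnicate, Options); in a language without default parameters, a delegating wrapper gives the same effect:

        // Existing signature stays put, so the dozen tests that call it
        // keep compiling unchanged.
        pub fn frobnicate(input: &str) -> String {
            frobnicate_with(input, Options::default())
        }

        // New behaviour lives behind an extended entry point; only new
        // call sites (and new tests) need to know about the extra parameter.
        pub fn frobnicate_with(input: &str, opts: Options) -> String {
            if opts.uppercase {
                input.to_uppercase()
            } else {
                input.to_string()
            }
        }

        #[derive(Default)]
        pub struct Options {
            pub uppercase: bool,
        }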

        Also use the “Given… When…() // Then…” pattern, so you know what each test case is testing and why.

        If you come back a month later and the test is failing… and giving a different answer, do you know why the old answer was correct? And whether the new answer is correct (or incorrect)?

        Or are you going to say, oh dear, the test is broken, I’ll change the test to check it does what it is currently doing…
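
        A rough sketch of that pattern in a test (Shop and its methods are hypothetical), so the reasoning survives a month later:

        #[test]
        fn returning_customer_gets_discount() {
            // Given: a customer who has already placed an order
            // (Shop is a hypothetical stand-in for the code under test)
            let mut shop = Shop::new();
            shop.place_order("alice", 100);

            // When: they place a second order
            let price = shop.place_order("alice", 100);

            // Then: the returning-customer discount (10%) applies, which is
            // why 90, not 100, is the correct answer.
            assert_eq!(price, 90);
        }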

        1. 1

          I think what you are saying is more or less equivalent to this:

          Provide the same backwards compatibility guarantees for internal APIs that you do for public APIs on the system boundary.

          I agree that this will solve the problem of brittle unit tests. The cure is worse than the disease, though: boundaries have a high cost.

          1. 1

            Adding a parameter doesn’t change the behaviour, so there shouldn’t be any need to change the unit tests; a default parameter solves the problem. Or, since all uses are in a single file, search and replace.

            There may be a problem with brittle unit tests… but this isn’t it.

            Where it might start biting you is if you have mocks everywhere, and tests unrelated to the change start breaking everywhere.

            Then the problem isn’t quite what you think it is.

            Mocks everywhere is a code smell.

            Can you just delete the mocks and use the real thing? Yup, do so, problem solved.

            Do you need mocks everywhere? Hmm, that’s a design smell, the pain is telling you something… listen to it.
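
            As a rough sketch (parse, eval, MockParser are all hypothetical) of “delete the mock and use the real thing”:

            // Before: every test stubs the parser behind a trait object, so
            // any change to that trait ripples through unrelated tests.
            //     let parser = MockParser::returning(Ast::Number(42));

            // After: the real parser is fast and has no external dependencies,
            // so tests just call it, and refactoring its internals no longer
            // breaks tests that only care about evaluation.
            #[test]
            fn evaluates_addition() {
                let ast = parse("40 + 2").unwrap(); // real parser, hypothetical API
                assert_eq!(eval(&ast), 42);
            }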

        2. 1

          I like to think in terms of layers of stability: application logic at the top, with a stack of libraries underneath, where a library is defined as an API that provides a stability guarantee. Often it’s third-party libraries, and then they have their own tests. But sometimes it’s an internal library that’s only used in that one application, and then the stability guarantees can be much looser, but you likely still want to test it.

          App logic
          ----- library interface ---
          unstable implementation goop
          ----- library interface ---
          unstable implementation goop
          

          Which is to say, you might still need tests for some internal APIs, but definitely not all of them, due to the costs mentioned in the article.
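
          A small Rust sketch of that layering (a made-up slugify library): tests hang off the interface, while the goop below it is free to change.

          // ----- library interface: stability (and tests) live here -----
          pub fn slugify(title: &str) -> String {
              let cleaned = strip_punctuation(title);
              join_with_dashes(&cleaned)
          }

          // ----- unstable implementation goop: no direct tests -----
          fn strip_punctuation(s: &str) -> String {
              s.chars()
                  .filter(|c| c.is_alphanumeric() || c.is_whitespace())
                  .collect()
          }

          fn join_with_dashes(s: &str) -> String {
              s.split_whitespace().collect::<Vec<_>>().join("-").to_lowercase()
          }

          #[cfg(test)]
          mod tests {
              use super::*;

              #[test]
              fn slugifies_a_title() {
                  assert_eq!(slugify("Hello, World!"), "hello-world");
              }
          }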

          1. 2

            > it’s an internal library that’s only used in that one application, and then the stability guarantees can be much looser, but you likely still want to test it.

            I’d say it might be fine to just test it via the application, but yeah, if it is enough of a library to have some sort of an interface, it’s better to test that interface. Basically, treat the library like a layer from the layers section of the post.

            I’ve just realized that I have an appropriate war story to share. In rust-analyzer, we originally started with keeping the syntax tree library in-tree. Then, at some point, we extracted it into a stand-alone rowan package. One problem with it, though, is that all the tests are still in the rust-analyzer repo. This actually is rather OK for myself: I can easily build the rust-analyzer test suite against a custom rowan and see if it breaks. It does make contributing to the library rather finicky for external contributors, though, as the testing workflow becomes rather non-traditional.
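
            (The “custom rowan” part is just a Cargo [patch] in the workspace manifest, something like the following; the local path here is assumed.)

            # Cargo.toml: build the workspace against a local rowan checkout
            [patch.crates-io]
            rowan = { path = "../rowan" }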

            1. 1

              I’ve made the mistake of testing all layers instead of only the layers where stability was meaningful. So the key, I think, is “where is stability a useful property?”. A generalized version of “only test public APIs”, since that’s not as meaningful in larger applications.

              1. 1

                Hm, I don’t think the mistake is necessarily the subject of testing. It might be just the way the tests are written.

                Here’s an example test which tests a very internal layer of rust-analyzer, but without stability implications:

                https://github.com/rust-analyzer/rust-analyzer/blob/5193728e1d650e6732a42ac5afd57609ff82e407/crates/hir_ty/src/tests/simple.rs#L91-L110

                It tests the type-inference engine and prints, for each expression, its type. Notably, this operates at the level of a completely unstable and changing internal representation. However, because the input is Rust code, and the output is an automatically updatable expectation, the test is actually independent of the specifics of type inference.
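
                A minimal sketch of that style using the expect-test crate (infer_types and its output format are made up here, standing in for the real test machinery):

                use expect_test::expect;

                #[test]
                fn infers_let_binding() {
                    // Hypothetical helper: dumps "expression: type" lines for the source.
                    let actual = infer_types("fn f() { let x = 1 + 1; }");

                    // The expectation is plain data, not assertions against internal
                    // types. Running with UPDATE_EXPECT=1 rewrites the literal in
                    // place when the (unstable) representation changes.
                    expect![[r#"
                        1 + 1: i32
                        x: i32
                    "#]]
                    .assert_eq(&actual);
                }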