1. 29
  1. 5

    This is a great testing approach! Without going into details, the bulk of tests indeed should be just data, with a smidge of driver code on top.

    Another important consideration is that the existence of a package for this is mostly irrelevant: you should just build it yourself (about 200 LoC for an MVP, I suppose?) if there isn’t one for your ecosystem. It’s not a hidden gem, it’s more of a leftpad (considering the total size of the tests you’d write with this).

    One thing I would warn about here is that these specific tests, which work via process spawning, are quite slow. If you have hundreds or thousands of them, process spawning will be a major bottleneck. Unless the SUT spawns processes itself, you’d be better off running a bunch of tests in-process.

    I think the example from the post shows this problem:

    https://github.com/encoredev/encore/blob/main/parser/testdata/cron_job_definition.txt

    Rather than invoking a parse process via testscript and asserting on stdout/stderr, this should have used just the txtar bit and called the parse function directly, for an order-of-magnitude faster test.
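
    For illustration, here is a rough sketch of that in-process shape, using the real golang.org/x/tools/txtar package but a hypothetical parse function and a hypothetical “expected” section inside the archive:

      package parser

      import (
          "testing"

          "golang.org/x/tools/txtar"
      )

      func TestCronJobDefinition(t *testing.T) {
          // Load the archive: input file(s) plus a golden "expected" section.
          ar, err := txtar.ParseFile("testdata/cron_job_definition.txt")
          if err != nil {
              t.Fatal(err)
          }

          var input, want string
          for _, f := range ar.Files {
              if f.Name == "expected" { // hypothetical name for the golden output
                  want = string(f.Data)
              } else {
                  input = string(f.Data) // hypothetical single-input convention
              }
          }

          // Call the parse function directly: no subprocess, no stdout capture.
          got := parse(input) // hypothetical in-package entry point
          if got != want {
              t.Errorf("parse mismatch:\ngot:\n%s\nwant:\n%s", got, want)
          }
      }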

    Some real world experience with this problem:

    • In Kotlin/Native, switching from many processes to a single process for tests reduced the overall time massively (I don’t remember the exact number).
    • The rustc test suite is very slow due to this problem. However, as rustc build times are already atrocious due to a suboptimal bootstrapping setup (both the compiler and the stdlib/runtime are magic code which needs to be bootstrapped; a better setup is one where the compiler is just a “normal Rust crate”), the testing time overall isn’t a pain point.
    • The cargo test suite is very slow, but, as cargo is fundamentally about orchestrating processes, there isn’t much one can do there.
    1. 5

      Author of the post here. Note that “parse” here does invoke a function, not a binary. You have to specify “exec” to execute a subprocess.

      1. 2

        Ah, I see now. Yeah, then this is just the perfect setup!

      2. 2

        Another important consideration is that the existence of a package for this is mostly irrelevant: you should just build it yourself (about 200 LoC for an MVP, I suppose?) if there isn’t one for your ecosystem. It’s not a hidden gem, it’s more of a leftpad (considering the total size of the tests you’d write with this).

        You seem to be saying (or implying) two different things here: (1) this is not a hidden gem: it’s like leftpad, and therefore you should write this yourself (the last clause is mine, but it seems—maybe?—implied by the first clause); (2) if this didn’t already exist, you could write it yourself. (2) seems fine, though—for the record—not true for many values of yourself. (I doubt I would write this myself, to be explicit.) (1) seems like a terrible case of NIH syndrome. This package is significantly more than 200 lines of code and tests. Why would I want to reinvent that?

        Finally my recollection is that left-pad was four lines, maybe less. There’s simply no comparison between the two projects. (I checked, and the original version of left-pad was 12 lines: https://github.com/left-pad/left-pad/commit/2d60a7fcca682656ae3d84cae8c6367b49a5e87c.)

        1. 1

          You seem to be saying (or implying) two different things here: (1)

          I tried to hedge against that reading with the explicit “if there isn’t one for your ecosystem”, but apparently failed :-)

          I doubt I would write this myself, to be explicit.

          This package is significantly more than 200 lines of code

          So this line of thinking is what I want to push back on a bit, and it is the reason why I think that part of my comment is useful. It did seem to me that this was being presented as a complicated black box which you are lucky to get access to, but whose implementation is out of scope. I do want to say that it is simpler than it seems, in the minimal configuration. The txtar part is splitting by a regex and collecting the result into a Map<Path, String>. The script part is splitting by lines, running each line as a subprocess, and comparing results. This is simple, direct programming using the basic instruments of the language/stdlib – no fancy algorithms, no deep technology stacks, no concurrency.
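
          As a rough Go sketch of how small that MVP is (all names here are mine; the header matching only approximates txtar’s “-- name --” convention, and the tokenization is naive whitespace splitting rather than real shell quoting):

            package minitest

            import (
                "fmt"
                "os/exec"
                "strings"
            )

            // splitTxtar is the "txtar part": split a test file into a leading
            // script plus named file sections, returning path -> contents.
            func splitTxtar(src string) (script string, files map[string]string) {
                files = map[string]string{}
                cur, buf := "", &strings.Builder{}
                flush := func() {
                    if cur == "" {
                        script = buf.String()
                    } else {
                        files[cur] = buf.String()
                    }
                    buf.Reset()
                }
                for _, line := range strings.SplitAfter(src, "\n") {
                    h := strings.TrimSpace(line)
                    if strings.HasPrefix(h, "-- ") && strings.HasSuffix(h, " --") {
                        flush()
                        cur = strings.TrimSpace(h[2 : len(h)-2])
                        continue
                    }
                    buf.WriteString(line)
                }
                flush()
                return script, files
            }

            // runScript is the "script part": run each non-empty line as a
            // subprocess in dir and fail on the first error.
            func runScript(script, dir string) error {
                for _, line := range strings.Split(script, "\n") {
                    line = strings.TrimSpace(line)
                    if line == "" || strings.HasPrefix(line, "#") {
                        continue
                    }
                    args := strings.Fields(line)
                    cmd := exec.Command(args[0], args[1:]...)
                    cmd.Dir = dir
                    if out, err := cmd.CombinedOutput(); err != nil {
                        return fmt.Errorf("%q failed: %v\n%s", line, err, out)
                    }
                }
                return nil
            }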

          Now, if you make this the main driver for your test suite, you’d want to add some fanciness there (smarter diffs, colored output, custom commands). But such needs arising means that your test suite is using the tool very heavily. If you have 10_000 lines of tests, 1000 lines of test driver are comparatively cheap. left-pad is small relative to its role in the application – padding some strings here and there. A test driver is small relative to its role in the application – powering the majority of tests at any moment, supporting the evolution of the test suite over time, and being a part of every edit-compile-test cycle for every developer.

      3. 3

        That’s a really cool test harness. Thanks for sharing.

        1. 2

          Feels somewhat similar to a test harness I wrote for testing Oil, e.g.

          https://github.com/oilshell/oil/blob/master/spec/xtrace.test.sh

          You write some shell, then put STDOUT blocks and STDERR blocks. (I used a little recursive descent parser where the tokens are lines!)
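
          A rough sketch of that line-tokenized parsing idea (in Go for consistency with the rest of the thread; the “## STDOUT:” / “## END” markers and all names here are approximations, not necessarily Oil’s exact format):

            package spec

            import "strings"

            // Case is one test case: shell code plus its expected output blocks.
            type Case struct {
                Code, Stdout, Stderr string
            }

            // parseCase is a tiny recursive-descent-style parser whose tokens
            // are whole lines: plain lines are code; "## STDOUT:" and
            // "## STDERR:" open an expected-output block ending at "## END".
            func parseCase(src string) Case {
                var c Case
                lines := strings.Split(src, "\n")
                for i := 0; i < len(lines); i++ {
                    switch strings.TrimSpace(lines[i]) {
                    case "## STDOUT:":
                        i, c.Stdout = parseBlock(lines, i+1)
                    case "## STDERR:":
                        i, c.Stderr = parseBlock(lines, i+1)
                    default:
                        c.Code += lines[i] + "\n"
                    }
                }
                return c
            }

            // parseBlock consumes lines until "## END", returning the index of
            // the terminator and the block's contents.
            func parseBlock(lines []string, i int) (int, string) {
                var b strings.Builder
                for ; i < len(lines) && strings.TrimSpace(lines[i]) != "## END"; i++ {
                    b.WriteString(lines[i] + "\n")
                }
                return i, b.String()
            }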

          After I wrote it, I found that other shells have more or less the same thing. I think there was one for the OpenBSD shell, written in Perl, etc.

          1. 2

            Similar ideas in/for other languages:

            • cram - arguably the most language agnostic, as test files are enhanced shell scripts
            • ppx_expect for OCaml, inspired by cram
            • Rust has expect-test (by matklad?) and insta.
            1. 2

              I wanted something like this that wasn’t Go-specific, so I made this: https://github.com/deref/transcript/

              Yes, it’s implemented in Go, but it’s mainly intended to be driven via the CLI (though there is also a Go API), so it can be used in any language ecosystem.

              Transcript doesn’t support some things that testscript does (like embedded files), but it does have an interactive record/update workflow.

              See also https://github.com/google/go-cmdtest

              1. 2

                I saw this post, and even though I had seen testscript before, it pushed me over the hump to use it with one of my projects: https://github.com/carlmjohnson/versioninfo/pull/2/files

                Some context implied but not stated is that testscript was created to test the go tool itself. If you write normal Go code, you don’t really need it. Where it helps is when you need, e.g., to test compiler flags, like my project above. Normal code can just use Go’s built-in testing package without the rigmarole of invoking command line commands and looking at stdout.
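
                For contrast, a self-contained sketch of what normal code gets away with, using only the standard testing package (greet is a stand-in function, not from my project):

                  package example

                  import (
                      "strings"
                      "testing"
                  )

                  // greet stands in for ordinary application code with no CLI surface.
                  func greet(name string) string { return "hello, " + name }

                  // A normal in-process test: no subprocess, no stdout scraping,
                  // no script files.
                  func TestGreet(t *testing.T) {
                      if got := greet("world"); !strings.Contains(got, "world") {
                          t.Errorf("greet(%q) = %q", "world", got)
                      }
                  }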

                1. 2

                  Some context implied but not stated is that testscript was created to test the go tool itself.

                  Well, the post includes this:

                  testscript was originally created for testing the Go compiler itself. It offers an easy way to define files with specific contents, and then assert that certain invocations of the go command produces certain outcomes: successfully building a binary, returning a particular error, printing a particular line on stdout, and so on.

                  That said, I suppose the author doesn’t stress the point later.