    crossposting my comment from r/programming:

    Maybe I haven’t woken up yet today, but it seems like there is a pretty glaring error in this post. Towards the end, the OP gets to 2 seconds and declares that faster than Go, but the only place the blog post shows the Go runtime is in the output of go test -v -bench=., which, at the bottom, reads:

    ok      _/home/cody/faster-command-line-tools-with-haskell/go   2.274s

    But that’s the total runtime for running the benchmark harness, not the runtime for each benchmark iteration. The time being recorded on the Haskell version seems to measure a single iteration using the time built-in command. If you look more closely at the benchmark output of the Go program, each run is about 0.25 seconds, which is considerably faster than the Haskell version.

    If I’m right about this error, then this should probably serve as an object lesson in benchmarking: always be aware of what you’re measuring, and when possible, use the same measuring tools across all samples. Or at the very least, be aware of the work each command you’re running is actually doing. In this case, it would probably be simplest to write a Go program and measure that with time.

    OK, I decided to check this for myself and it looks like I’m right and that there is an error here:

    $ git clone git://github.com/codygman/faster-command-line-tools-with-haskell
    $ cd faster-command-line-tools-with-haskell
    $ tar xf ngrams.tsv.tgz
    $ cd go
    $ $EDITOR goversion.go
    ... add a main function, see: https://gist.github.com/0d9cb7930e64a5819f0e1f01586ff1b9
    $ go build
    $ time ./go
    real    0m0.432s
    user    0m0.380s
    sys     0m0.054s
    $ cd ../haskell-version
    $ stack build
    $ time stack exec -- haskell-version
    real    0m3.569s
    user    0m3.438s
    sys     0m0.200s

    As far as I can tell, I’m running the Haskell program in exactly the same way as the OP. I ran each time’d command several times to control for I/O cache.

    If I run go test -v -bench=. in the go directory, then its “total” output time is a hair slower than the Haskell version’s runtime, which is consistent with my hypothesis about the error, I think.

    > NOTE that runtime for profiled builds will be much slower as in any language.

    This isn’t necessarily true and depends on how you’re profiling the program. For example, if you can meaningfully run your program with perf, then you can get profiles without substantially reducing the run time of your program. (perf works well with Go as of late.)

    1. 1

      I don’t know if it makes a big difference, but there is overhead from running stack exec.

      1. 1

        Yeah, someone on reddit asked about that, but it doesn’t matter in this case. The process run time is long enough that the stack exec overhead is negligible. I tried running the binary directly.