Threads for harpocrates

  1. 7

    it takes advantage of a compilation model based on proper modules (crates)

    Hang on a second, crates are not ‘proper’ modules. Crates are a collection of modules. If any file in the crate changes, the entire crate needs to be recompiled.

    ‘Proper’ modules are ones where the unit of compilation is the file, and the build can be parallelized and made much more incremental.

    The first advice you get when complaining about compile times in Rust is: “split the code into crates”.

    When files are compilation units, you split the code into files, which is standard development practice.

    Keeping Instantiations In Check… If you care about API ergonomics enough to use impl Trait, you should use the inner-function trick: compile times are as big a part of ergonomics as the syntax used to call the function.

    I think something has gone off the rails if, in the normal course of writing code, you need to use sophisticated techniques like these to contain build times. Hopefully it gets better in the future.

    Great article though.

    1. 10

      Yeah, agree that calling something “proper” without explaining your own personal definition of proper is wrong. What I meant by “proper modules” is more or less two things:

      • code is parsed/typechecked only once (no header files)
      • there are explicit up-front dependencies between components; there’s no shared global namespace à la PYTHONPATH or CLASSPATH

      So, I wouldn’t say that the pre-C++20 compilation model has “proper modules”.

      That being said – yes, that’s a very important point that the (old) C++ way of doing things is embarrassingly parallel, and that’s a huge deal. One of the problems with Rust builds is that, unlike C++ builds, they are not embarrassingly parallel.

      I’d actually be curious to learn what’s the situation with C++20 – how does template compilation actually work with modules? Are builds still as parallel? I’ve tried reading a couple of articles, but I am still confused about this.

      And yeah, it would be better if, in addition to crates being well-defined units with specific interfaces, it were possible to naively process each crate’s constituent modules in parallel.

      I think something has gone off the rails if, in the normal course of writing code, you need to use sophisticated techniques like these to contain build times.

      To clarify, there’s an “or” there: in normal application code one should just write

      use std::fs::File;
      use std::io::{self, Read};
      use std::path::Path;

      pub fn read(path: &Path) -> io::Result<Vec<u8>> {
        let mut file = File::open(path)?;
        let mut bytes = Vec::new();
        file.read_to_end(&mut bytes)?;
        Ok(bytes)
      }
      

      There’s no need to make this generic at all, unless you are building a library.
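      And for the library case, the inner-function trick looks roughly like this. A sketch, not the article’s code: `file_name_len` and its behavior are invented for illustration, though `std::fs::read` in the standard library has the same shape.

      ```rust
      use std::path::Path;

      // Generic shim: monomorphized once per caller's concrete `P`, but tiny.
      pub fn file_name_len<P: AsRef<Path>>(path: P) -> usize {
          // Non-generic worker: compiled exactly once, no matter how many
          // different `P`s callers use.
          fn inner(path: &Path) -> usize {
              path.file_name().map_or(0, |name| name.len())
          }
          inner(path.as_ref())
      }

      fn main() {
          // Both calls share the single compiled `inner`.
          assert_eq!(file_name_len("foo/bar.txt"), 7);
          assert_eq!(file_name_len(String::from("baz.rs")), 6);
      }
      ```

      The ergonomic generic signature stays, but only the thin shim gets instantiated per call-site type.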

      But I kinda disagree with the broader assertion. Imo, in Rust you absolutely should care about compile times when writing application code, the same way you should care what to put in a header file when writing C++. I think we simply don’t know how to make a language which is both fast to run and fast to compile. If you choose Rust, you choose a bunch of accidental complexity, including slower build times. If you don’t care about performance that much, you probably should choose a different language.

      That being said, I would love to read some deeper analysis of D performance though – my understanding is that it, like C++ and Rust, chose the “slow compiler” approach, but at the same time compiles as fast as Go? So maybe we actually do know how to build fast, fast-to-compile languages, just not too well?

      1. 2

        I’d actually be curious to learn what’s the situation with C++20 – how template compilation actually works with modules?

        I believe pretty much like in Rust: templates are compiled to some intermediate representation and then used during instantiation.

        Are builds still as parallel?

        No, again the situation is pretty much like in Rust: a module interface is compiled into BMI (binary module interface; equivalent to Rust’s crate metadata) and any translation unit that imports said module cannot start compiling before the BMI is available.

        I also agree that a C++20 module’s approximate equivalent in Rust is a crate (and not a module).

        BTW, a question on monomorphization: don’t arguments’ lifetimes also turn functions into templates? My understanding is that while in C++ we have type and value template parameters, in Rust we also have lifetime template parameters, which turn Rust into an “almost everything is a template” kind of language. But perhaps I am misunderstanding things.

        1. 5

          No, again the situation is pretty much like in Rust: a module interface is compiled into BMI (binary module interface; equivalent to Rust’s crate metadata) and any translation unit that imports said module cannot start compiling before the BMI is available.

          Thank you! This is much more helpful (and much shorter) than the articles I’ve mentioned.

          BTW, a question on monomorphization: don’t arguments’ lifetimes also turn functions into templates

          That’s an excellent question! One of the invariants of the Rust compiler is lifetime parametricity – lifetimes are completely erased after type checking, and code generation doesn’t depend on lifetimes in any way. As a special case, “when the value is dropped” isn’t affected by lifetimes. Rather the opposite – the drop location is fixed, and the compiler tries to find a lifetime that’s consistent with this location.

          So, while in the type system type parameters, value parameters and lifetimes are treated similarly, when generating machine code types work like templates in C++, and lifetimes roughly like generics in Java. That’s the reason why specialization takes so much time to ship – it’s very hard to make specialization not depend on lifetimes.
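          A small illustration of the asymmetry (the function is invented for this sketch): the type parameter below gets one instantiation per concrete T, while the lifetime parameter is erased and never produces a copy.

          ```rust
          // Instantiated once per concrete `T`, like a C++ template...
          fn first<'a, T>(xs: &'a [T]) -> Option<&'a T> {
              // ...but `'a` is erased after type checking: it constrains which
              // borrows the compiler accepts, and never affects generated code.
              xs.first()
          }

          fn main() {
              let ints = [1u32, 2, 3];
              // first::<u32> -- one copy of machine code.
              assert_eq!(first(&ints), Some(&1));
              {
                  let strs = ["a", "b"];
                  // A different, shorter lifetime; a second copy exists only
                  // because `T` differs (first::<&str>), not because `'a` does.
                  assert_eq!(first(&strs), Some(&"a"));
              }
          }
          ```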

        2. 1

          Well your definition is interesting, because I actually don’t (exactly) agree.

          • code is parsed/typechecked only once (no header files)

          In OCaml you have ‘header’ or rather interface files, which are parsed/typechecked only once. They’re the equivalent of C++ precompiled headers except the compiler does it automatically.

          • there are explicit up-front dependencies between components; there’s no shared global namespace à la PYTHONPATH or CLASSPATH

          Again in OCaml, there are dependencies between modules, but they are implicit. They just fail the build if modules are not compiled in the right order with the right dependencies. Fortunately the build system (dune) takes care of all that.

          Also OCaml has a shared global namespace: all source code files automatically become globally-visible modules within the project. Again, fortunately, the build system provides namespacing within projects to prevent name clashes.

          Another example of ‘proper modules’ is Pascal’s units, which actually satisfy both of your criteria above (no header files, and explicit dependencies between units), and provide embarrassingly-parallel compilation.

          I think we simply don’t know how to make a language which is both fast to run and fast to compile.

          That may well be true.

          D

          From what I’ve heard, D compiles fast. And I assume it runs fast too. OCaml is pretty similar e.g. in some benchmarks it has similar single-core performance to Rust.

          1. 1

            Yeah (and here I would probably be crucified), I’d say that OCaml doesn’t have proper modules :-) Compilation in some order into a single shared namespace is not modular.

            Rust’s approach with an explicit DAG of crates which might contain internal circular dependencies is much more principled. Though, it’s sad that it lost crate interface files at some point.

            1. 1

              I’d say that OCaml doesn’t have proper modules :-)

              Heh, OK then ;-)

              Compilation in some order into a single shared namespace is not modular.

              That’s exactly what Rust crates end up doing. It just shifts the concept of ‘namespace’ into the crate names. Same with Java, C#, etc. Rust:

              use serde::{Serialize, Deserialize};
              

              OCaml:

              Serde.Serialize.blabla
              
              1. 2

                It just shifts the concept of ‘namespace’ into the crate names

                Not exactly: Rust crates don’t have names. The name is a property of the dependency edge between two crates. The same crate can be known under different names in two of its reverse dependencies, and that same name can refer to different crates in different crates.
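                Concretely, this is visible in Cargo’s dependency renaming (the `serde` package name is real; the local alias below is invented for illustration):

                ```toml
                [dependencies]
                # Locally this crate is imported as `serde_v1`; another crate in
                # the same build may bind the plain name `serde` to a different
                # version, or to a different crate entirely. The name lives on
                # the dependency edge, not on the crate itself.
                serde_v1 = { package = "serde", version = "1" }
                ```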

                1. 1

                  I think this is a distinction without a difference. Rust crates have names. They’re defined in the Cargo.toml file. E.g. https://github.com/serde-rs/serde/blob/65e1a50749938612cfbdb69b57fc4cf249f87149/serde/Cargo.toml#L2

                  [package]
                  name = "serde"
                  

                  And these names are referenced by their consumers.

                  The same crate can be known under different names in two of its reverse dependencies

                  But they have to explicitly rename the crate though, i.e. https://stackoverflow.com/a/51508848/20371 , which makes it a moot point.

                  and that same name can refer to different crates in different crates.

                  Same in OCaml. Different projects can have toplevel modules named Config and have them refer to different actual modules. If there is a conflict it will break the build.

                  1. 2

                    If there is a conflict it will break the build.

                    The build will work in Rust. If, eg, serde some day publishes serde version 2.0, then a project will be able to use both serdes at the same time.

                    So, eg, you can depend on two libraries, A and B, which both depend on serde. Then, if serde publishes 2.0 and A updates but B does not, your build will continue to work just fine.

                    Not sure if OCaml can do that, but I think Java (at least pre modules)/C/Python can’t, without some hacks.

                    1. 1

                      That is a design choice. OCaml makes the choice to not allow duplicate dependencies with the same name.

        3. 4

          What do you mean when you say ‘proper’ modules?

          I understand the author to mean a compilation unit: a boundary around which you can count on compilation being separable (and therefore likely also cacheable). In C and C++, that happens to be a file (modulo some confusion about the preprocessor). In Rust, it is a collection of files called a crate. Once you accept that, everything else you say about file-based compilation units holds for Rust crates too: you can parallelize compilation of crates, you can cache compiled crates, and you get faster compile times from splitting code into crates.

          1. 1

            I defined ‘proper’ modules in the second paragraph of my comment.

        1. 1

          ghc – Uses the Happy parser generator (LALR/GLR?)

          GHC uses Happy with an LALR grammar. The Happy parser generator itself has some additional support/options for GLR, although I’m not sure how much maintenance and use that codepath has gotten over the years.

          1. 5

            What does “teaching/coding language” mean? What are you even trying to teach? I think the subtext is that all the things the IDE makes convenient, things that you would need to learn if you were using Vim, are not coding? Am I right?

            Why not just write that then? I am not saying IDEs are bad, but now that we are implicitly redefining words, why don’t you just let me redefine coding as “non-HTML based cross-platform GUI coding”? Now F# suddenly doesn’t seem like a good “coding language” ;)

            Here comes my typical Haskeller retort: because F# doesn’t have a sufficiently strong type system (effectful code based on higher-ranked polymorphism is not popular in F#), you are not really doing FP. Without higher-rank polymorphism, you’re stuck in the pure part of FP, which is the easy one. F# jumps the shark on the hard parts, and it isn’t convenient to write modularized effectful code.

            So is the implication here that “teaching language” means you don’t need to worry about effects? I think teaching should be about doing it right. If F# guides you down a wrong path, is it really a good teaching language?

            I’d agree more to something like “beginner FP language”.

            1. 5

              because F# doesn’t have a sufficiently strong type system (effectful code based on higher-ranked polymorphism is not popular in F#), you are not really doing FP. Without higher-rank polymorphism, you’re stuck in the pure part of FP, which is the easy one. F# jumps the shark on the hard parts, and it isn’t convenient to write modularized effectful code.

              For anyone unfamiliar with these terms: functional programming and higher-rank polymorphism are completely unrelated concepts.

              Functional programming is about avoiding mutation and side effects, and higher-rank polymorphism is a type-checking feature. Functional code doesn’t stop being functional if you remove the type-checking step from your build, so it’s not possible for higher-rank polymorphism to be a requirement for doing FP.

              1. 6

                I don’t understand this FP gatekeeping.

                Functional programming is about avoiding mutation and side effects, and higher-rank polymorphism is a type-checking feature. Functional code doesn’t stop being functional if you remove the type-checking step from your build

                A couple years back, there were lots of posts about JavaScript being functional. At that time, I remember hearing arguments that closures and functions were what functional programming is really about.

                Personally, I’m inclined to slightly agree with the grandparent comments: without higher ranked types it is tricky to represent IO in an immutable fashion. Since the alternative is to bail out and just use mutation (as F# does) then it isn’t as functional as it could be (by your own definition of “avoiding mutation and side effects”).

                1. 1

                  without higher ranked types it is tricky to represent IO in an immutable fashion. Since the alternative is to bail out and just use mutation (as F# does) then it isn’t as functional as it could be (by your own definition of “avoiding mutation and side effects”).

                  Elm is a typed pure functional language without higher-ranked types, and Elm is even stricter about purity of effects than Haskell, which has unsafePerformIO. (Elm has no equivalent of that.)

                  I don’t think there’s anything particularly tricky about it!

                  1. 5

                    Elm chooses to limit pretty strongly what sort of IO the user does to the point that I’m not sure it counts as a general purpose programming language. For example: what is the type of a function that reads a file? How do you capture those effects? You can definitely avoid using monads for IO, but IMO it leads to less powerful or less usable systems. For example: Haskell initially modeled IO using infinite lists (and it wasn’t very usable).

                    unsafePerformIO is a compiler hack. It isn’t part of the Haskell specification, it is just used for FFI interop with other languages. EDIT: the legitimate use cases for unsafePerformIO are extremely rare and are (to my knowledge) always around some performance optimization that is made outside of Haskell 98.

                    1. 1

                      You can definitely avoid using monads for IO, but IMO it leads to less powerful or less usable systems.

                      Elm does use monads for I/O, it just doesn’t call them that.

                      Here is Elm’s equivalent of Haskell’s IO (except that it includes error handling, which Haskell’s IO doesn’t):

                      https://package.elm-lang.org/packages/elm/core/latest/Task#Task

                      Nothing tricky about it imo.

                      1. 2

                        Can I define my own Task to call a C function using FFI? Can’t find it.

                        1. 3

                          Only if your name starts with Evan. ;-)

                          1. 1

                            Elm compiles to JavaScript, so no.

                            Given that the thread was about whether functional programming has anything to do with higher-ranked types, I’ll assume this change of topic means that discussion is over.

                          2. 2

                            Oh man, that link reminds me of Elm’s time handling approach that made me want to take Evan’s hand and tell him “Kid, it’s not as easy as you think it is. You think you had a really smart idea there, but really, you didn’t.”.

                            1. 1

                              Interesting - I had the opposite reaction. I’ve never used a time API I liked as much as Elm’s. What didn’t you like about it?

                              1. 1

                                From an ergonomic perspective, an example problem (of many!) is the choice to make time zones explicit. I suppose that this comes from the SPA tradition, where ad-hoc user input (“Choose your language”) or Web browser APIs are used to establish a preferred presentation language instead of content negotiation. However, just like with the Unicode sandwich technique, we can have a UTC Sandwich design where times on Earth are handled uniformly, and only the outermost presentation layers need to worry about timezones.

                                Worse, in my opinion, is the decision to make access to timers unprivileged. This invites timing side-channels. I know of no solutions besides making them explicit parameters instead of allowing them to be ambiently imported.

                                In general, Elm’s libraries and designs are oriented towards SPAs but not towards backend services, ruining the hope of so-called “isomorphic” deployments where identical code runs in the Web browser and the backend.

                                1. 1

                                  From an ergonomic perspective, an example problem (of many!) is the choice to make time zones explicit. […] However, just like with the Unicode sandwich technique, we can have a UTC Sandwich design where times on Earth are handled uniformly, and only the outermost presentation layers need to worry about timezones.

                                  I personally like this technique - explicit time zones are one of my favorite parts about the design of the elm/time package - but fair enough if your preferences differ from mine!

                                  Worse, in my opinion, is the decision to make access to timers unprivileged. This invites timing side-channels. I know of no solutions besides making them explicit parameters instead of allowing them to be ambiently imported.

                                  Hmm, how would a language (any language!) support even basic current-time use cases (like obtaining a current timestamp) without making timing attacks possible?

                                  1. 1

                                    Time-zone handling isn’t just a preference. We’ll necessarily incur an extra table lookup if we want to decode a historical timestamp which is relative to some time zone, so it’s slower. That extra table has to be from an old time-zone map, so we must store an ever-growing number of old time-zone maps, so it’s bigger. And it is another thing that programmers might get incorrect, so it’s got more potential for bugs.

                                    By “explicit parameters” for timers, I mean that the application’s entry points would accept timer values as parameters, and invoking those timer values would produce timestamps. Timing attacks are still possible, but they no longer have the potential to run rampant through a codebase. For a practical example, this Monte module implementing Python’s timeit algorithm is parameterized upon some timer object, but doesn’t have automatic access to any privileged clocks or timers. This module drawing some console graphics explicitly requests the system clock Timer and syntactically reveals where it is used.

                                    1. 1

                                      We’ll necessarily incur an extra table lookup if we want to decode a historical timestamp which is relative to some time zone, so it’s slower.

                                      I see - so, to check my understanding: for reasons beyond your control, it’s stored in the database (or you get it from an external data source) relative to a particular time zone, and you want to decode it (and then work with it?) while staying relative to that time zone?

                    2. 2

                      functional code doesn’t stop being functional if you remove the type-checking

                      Like the “Rust is NP-hard” post shows, the type system is also used for code generation. So I can’t even run my code after removing the type checker.

                      We don’t need to mince words about what FP is, because this is a discussion about what F# is missing. I should have said “pure typed FP”. Would you agree with me then?

                      1. 3

                        I should have said “pure typed FP”. Would you agree with me then?

                        Also no; a counterexample would be Elm, which is a typed pure functional language without higher-rank polymorphism!

                        1. 7

                          You can also have an effect system (like Koka) and still not have higher-kinded types. These are only necessary if you want to abstract over monads.

                  1. 33

                    This post is missing actual content. After reading “I make that claim based on three assumptions:” and then the assumptions, I expected to see some attempt at justifying why F# is strong in those areas. Based on the criteria given (which I’m not sure I align with), I would say a language like Kotlin would actually be a better contender:

                    • The tooling for Kotlin is pretty good. It was initially started by JetBrains (so stellar IDE) and since Google made it the preferred language for Android, the situation has only improved.
                    • Kotlin has all the “functional” features that F# has (type inference, pattern matching, first class functions). They both lack type classes or higher kinded types.
                    • The developer community for Kotlin is quite a bit bigger than F#’s. I wasn’t able to find a programming language index where Kotlin didn’t outperform F# (and sometimes F# wouldn’t even show up on the index at all!).
                    1. 10

                      Kotlin doesn’t have a classic ML-style implementation of pattern matching.

                      1. 5

                        This is what I was trying to say. It felt like an introduction to an article … but the body of the article is missing.

                        1. 3

                          Yes, I agree. Just about any language could be a part of that article. And it’s not even qualifying any of the assumptions.

                          Practically a shit post. “F# is the best!” and then goes away without explaining.

                        1. 4

                          In my experience, TCO is often relied on for correct behavior as opposed to just being bonus performance. That means the following is a pretty significant downside!

                          But since it is only applied under certain circumstances, the downside is that when it is not applied, we won’t be made aware of it unless we check for it.

                          Are there any plans to add some sort of syntax to indicate to the compiler that it should error if it can’t perform TCO? OCaml has @tailcall for this purpose and Scala has @tailrec, although in Scala’s case the compiler won’t even try to do TCO unless you request it.

                          Also: how does Elm handle source maps for TCO functions? As I recall, increased debugging difficulty was one of the reasons V8 backed out of automatically doing TCE (and switched to backing explicit tail calls instead).

                          1. 2

                            The article buries the lede, but it’s exactly an announcement of a tool that checks Elm code for TCO.

                            1. 1

                              Maybe my coffee hasn’t fully kicked in yet, or maybe it’s been too long since I’ve programmed in a proper functional language, but how or when would TCO change behavior?

                              1. 4

                                One example that comes to mind: TCO can be the difference between “calling this function with these arguments always returns the correct answer” and “calling this function with these arguments sometimes returns the correct answer, and sometimes crashes with a stack overflow.”
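                                In Rust terms (which, like Elm, makes no tail-call guarantee), the difference can be sketched as follows; the functions are invented, and the depth at which the recursive one overflows is platform- and build-dependent, so the crashing call only appears in a comment:

                                ```rust
                                // Tail-recursive in form, but nothing guarantees the compiler
                                // eliminates the call: each level may consume a stack frame.
                                fn sum_to(n: u64, acc: u64) -> u64 {
                                    if n == 0 { acc } else { sum_to(n - 1, acc + n) }
                                }

                                // What a guaranteed TCO pass would effectively produce:
                                // constant stack space at any input size.
                                fn sum_to_loop(mut n: u64, mut acc: u64) -> u64 {
                                    while n > 0 {
                                        acc += n;
                                        n -= 1;
                                    }
                                    acc
                                }

                                fn main() {
                                    // Small inputs agree...
                                    assert_eq!(sum_to(1_000, 0), 500_500);
                                    assert_eq!(sum_to_loop(1_000, 0), 500_500);
                                    // ...but in a debug build, sum_to(10_000_000, 0) may abort
                                    // with a stack overflow, while sum_to_loop cannot: the same
                                    // code, different behavior, not merely different speed.
                                }
                                ```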

                                1. 1

                                  Put slightly differently, TCO makes some partial functions total.

                                  1. 3

                                    If running out of stack frames and/or stack memory counts as making a function partial, then does the OS possibly running out of memory mean that no functions are ever total?

                                    Right? Since “the stack” isn’t an explicit abstraction in most programming languages, I don’t think it’s quite correct/useful to say that a recursive function is partial when it can’t be TCO’d.

                                    1. 3

                                      I don’t think it’s out of bounds to say that. It really depends on the conceptual model that your language is providing. For example, it seems to be an operating principle of Zig: every function in the stdlib that allocates takes an allocator, so you can handle out-of-memory conditions intelligently.

                                      But, I get your point: it isn’t an explicit part of the conceptual model of most languages, so it’s shifting the window a bit to refer to non-TCO functions as partial. I think it’s a potentially useful perspective and, for what it’s worth, most languages don’t really describe their functions as total/partial anyway.

                                2. 2

                                  Recursion in a language without TCO feels like a fool’s errand. Source: I tried to implement some recursive algorithms in C…. on 16-bit Windows. On a modern CPU, you can probably get away with recursion even if it eats stack, because you have virtual memory and a shitload of address space to recurse into. Not so much on a 286….

                                  1. 1

                                    I definitely agree! I never, ever, write recursive code in a language that doesn’t have a way to at least opt-in to recursion optimization.

                                    But to me, that’s still “performance” and not really “behavior”. But maybe I’m defining these things a little differently.

                                  2. 1

                                    Not sure if @harpocrates meant this but:

                                    Often, if the recursion depth is large enough, the unoptimized version uses a lot of stack space, even potentially an unbounded amount, where the optimized version uses constant space.

                                    So the unoptimized version is not only slower but actually crashes if the stack is used up.

                                    1. 1

                                      Hmm. I assumed that “bonus performance” would include memory usage. And I would’ve lumped the resulting stack overflow in with “performance” concern, but I guess I can see how that might actually be considered behavior since the “optimized” version will complete and an unoptimized version might not.

                                      It’s just weird because I don’t think anybody would tell me that putting an extra private field that I never use on a class would be a “behavior change” even though that makes my class use more memory and therefore will OOM on some algorithms when it might not OOM if I remove the field.

                                      1. 1

                                        An additional private field in a class that is used a million times on the stack might be similar, true. The heap may often be bigger (citation needed).

                                        With recursive calls, you can use up a lot of memory for innocent looking code that e.g. just sums up a list of integers.

                                1. 7

                                  I wonder, what is the historical reason for this style not being default? I’ve witnessed two transitions from “continuation indent” (A) to “single indent, ) goes to the new line” (B).

                                  Kotlin code was originally written with A, because code in IntelliJ was written with A, but then they switched to B for the official style. rustc originally used A because ???, and then, with the introduction of rustfmt, it was switched to B.

                                  B also seems much easier to maintain without tooling, as you don’t have to manually align parameters. Why is it that people seem to prefer A by default?

                                  1. 3

                                    (A) is sometimes an initial choice because a language does not initially support trailing commas. Without trailing commas, you lose the benefit of reducing line-diffs by putting the closing paren on the next line. However, once the language picks up more traction, the complexity involved in adding support for trailing commas might be outweighed by their convenience (for formatting, code generation, etc.). If support for trailing commas does get added, the line-diff argument becomes a pretty strong argument for (B).
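                                    A sketch of the two styles (function and parameter names invented):

                                    ```rust
                                    // Style A: continuation indent, parameters aligned to the opening
                                    // paren. Renaming the function forces re-aligning every line below.
                                    fn frobnicate_a(first_argument: u32,
                                                    second_argument: u32) -> u32 {
                                        first_argument + second_argument
                                    }

                                    // Style B: single indent, `)` on its own line. With a trailing
                                    // comma, adding a parameter touches exactly one line of a diff.
                                    fn frobnicate_b(
                                        first_argument: u32,
                                        second_argument: u32,
                                    ) -> u32 {
                                        first_argument + second_argument
                                    }

                                    fn main() {
                                        assert_eq!(frobnicate_a(1, 2), frobnicate_b(1, 2));
                                    }
                                    ```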

                                    I think this might even have been the case for Kotlin, which I don’t think had trailing commas in its initial release, although I can’t tell if that was because Java didn’t support trailing commas, or because the preferred style was (A).

                                    1. 2

                                      I wasn’t there when this history was being made but I’ve heard the following from the veterans of those early days:

                                      • it looks “wrong” for some Pythonistas to return to the original indentation level before the entire block starting with, say, an if statement is complete;
                                      • early editor support like Emacs didn’t properly indent Python code based on the colon but instead on the first non-whitespace token in the line. So it would indent a line that started with if but it wouldn’t care about whether there’s a colon at the end. This makes the sadface indent style clumsy to use in such editors;
                                      • it’s subjectively “ugly” because the bare combination of a closing paren with a colon looks alien.
                                    1. 6

                                      Although I’m told there are good reasons for it, the fact that Scala 3 is developed and maintained in a separate repo from Scala 2 continues to concern me. That ecosystem has clearly been good for research and contributions from post-docs, but I worry it isn’t as good at avoiding regressions.

                                      The natural consequence of the divergence has been increased contribution overhead to the compiler (two PRs over one), which leads to some patches only ending up in one version. Also, since some of Scala 3 was developed from scratch, there are still a lot of Scala 2 features not in Scala 3 (I think the entire optimizer is gone, the REPL functionality is different, there are still compiler options missing). Going through the git history is also not particularly fun: you sometimes end up in massive patches that have commit messages like “copy files over from Scala 2”.

                                      1. 6

                                        I read through the corresponding Java source (EnumSet, RegularEnumSet, JumboEnumSet), and it seems that the main extra work done by java.util.EnumSet is bounds-checking. Additionally, more than 64 fields per enum are supported, although that code path is probably not entered by the benchmark. It is good to know the costs of safety and feature-completeness.
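
                                        To make the comparison concrete, here is a minimal sketch (enum and class names are made up) of the equivalence being described: a RegularEnumSet keeps all its elements in a single long, so membership is ultimately the same bit test as a hand-rolled mask, just wrapped in extra checks.

                                        ```java
                                        import java.util.EnumSet;

                                        public class EnumSetVsMask {
                                            enum Permission { READ, WRITE, EXECUTE }

                                            public static void main(String[] args) {
                                                // Hand-rolled bit flags, mirroring RegularEnumSet's layout:
                                                // the constant's ordinal picks the bit.
                                                long READ = 1L << Permission.READ.ordinal();
                                                long WRITE = 1L << Permission.WRITE.ordinal();
                                                long EXECUTE = 1L << Permission.EXECUTE.ordinal();

                                                EnumSet<Permission> set = EnumSet.of(Permission.READ, Permission.WRITE);
                                                long mask = READ | WRITE;

                                                // Both membership tests come down to a bit test; EnumSet
                                                // additionally checks the element before looking at the bits.
                                                System.out.println(set.contains(Permission.READ));    // true
                                                System.out.println((mask & READ) != 0);               // true
                                                System.out.println(set.contains(Permission.EXECUTE)); // false
                                                System.out.println((mask & EXECUTE) != 0);            // false
                                            }
                                        }
                                        ```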

                                        1. 5

                                          It is good to know the costs of safety and feature-completeness.

                                          Is that cost unavoidable though? Ada’s representation clauses are just as safe and feature-complete (I’d argue even more feature complete, as you can have parts of your bitfields span multiple bits!) while being more performant. Take a look at the examples here if you’re curious: https://docs.adacore.com/gnat_rm-docs/html/gnat_rm/gnat_rm/representation_clauses_and_pragmas.html#record-representation-clauses

                                          1. 2

                                            The cost is not nearly as high as the blog suggests (probably less than a 2x factor in general). Yes, it isn’t trivial, but it isn’t orders of magnitude. The OP is just not benchmarking properly (see the other comments for more). Unless the codepaths for more than 64 fields are exercised, the JIT will pretty quickly realize those are unlikely (and optimize accordingly).

                                            Very few JVM abstractions are “zero cost” (the way you could argue they often are in Rust), but OTOH the JVM does a pretty good job of optimizing at runtime based on your actual workload.

                                          1. 12

                                            I’m sorry, but this is not how you write micro-benchmarks. In this case, the methodology does end up affecting the results: when you fix that, bitfields don’t end up being nearly so much faster. The author is also clearly biased, and seems more concerned about reaching the conclusions that they want than being correct (seeing 0.000000057 should cause them to question their entire benchmark process, not to conclude that bitfields are obviously much better).

                                            Rather than just complain, I ported this benchmark to JMH (full code here - I’m not a JMH expert, so please point out flaws) and the numbers I got indicate something much more reasonable:

                                            [info] Bench.enumSet  avgt  100   6.605 ± 0.173  ns/op
                                            [info] Bench.hashSet  avgt  100  83.031 ± 9.414  ns/op
                                            [info] Bench.maskSet  avgt  100   4.200 ± 0.123  ns/op
                                            

                                            In case you are wondering, more than half of the hashSet time is actually just creating the hash set.

                                            That really brings me to the final point: this benchmark tells us very little about HashSet, EnumSet, or bit flags. It doesn’t mix in multiple checks (you should probably benchmark groups of operations so the JIT doesn’t overfit in a way it couldn’t in a real example) and it doesn’t check different operations.
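
                                            For readers who want to see roughly what each of the three benchmark cases does per iteration, here is a hypothetical sketch (not the actual JMH code; names invented). It also shows why the hashSet case is dominated by construction: it allocates and hashes, while the other two reduce to a few bit operations.

                                            ```java
                                            import java.util.EnumSet;
                                            import java.util.HashSet;
                                            import java.util.Set;

                                            public class SetVariants {
                                                enum Flag { A, B, C, D }

                                                static boolean hashSet() {
                                                    Set<Flag> s = new HashSet<>();            // allocation + hashing dominates
                                                    s.add(Flag.A); s.add(Flag.C);
                                                    return s.contains(Flag.C);
                                                }

                                                static boolean enumSet() {
                                                    Set<Flag> s = EnumSet.of(Flag.A, Flag.C); // backed by a single long
                                                    return s.contains(Flag.C);
                                                }

                                                static boolean maskSet() {
                                                    long s = (1L << Flag.A.ordinal()) | (1L << Flag.C.ordinal());
                                                    return (s & (1L << Flag.C.ordinal())) != 0; // one AND + compare
                                                }

                                                public static void main(String[] args) {
                                                    // All three variants agree on the membership result.
                                                    System.out.println(hashSet() && enumSet() && maskSet());
                                                }
                                            }
                                            ```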

                                            1. 1

                                              I guess the TypeScript one would need to be solved by restricting array types to have no implicit casting, and giving the programmer enough rope to do explicit casting if they wish.

                                              1. 2

                                                That particular soundness bug runs deeper than the array type: TypeScript lets type parameters in property types (even mutable ones!) and in function parameters be covariant. This is a pretty big soundness hole, but it is apparently very useful for typing typical JavaScript code. You could fix the soundness issue (e.g. with variance annotations), but that would run counter to TypeScript’s main goal of being a compile-time layer on top of existing idiomatic JavaScript.
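
                                                As a point of comparison, Java made a similar trade with its arrays: they are covariant for convenience, and the resulting unsound write is only caught by a runtime check. A small illustration of the same hole in Java terms:

                                                ```java
                                                public class Covariance {
                                                    public static void main(String[] args) {
                                                        // Arrays are covariant: String[] is accepted where Object[] is expected.
                                                        Object[] objs = new String[1];   // compiles fine
                                                        try {
                                                            objs[0] = 42;                // not a String; fails only at runtime
                                                        } catch (ArrayStoreException e) {
                                                            System.out.println("caught ArrayStoreException");
                                                        }
                                                        // Generics close the same hole at compile time by being invariant:
                                                        // java.util.List<Object> xs = new java.util.ArrayList<String>(); // rejected
                                                    }
                                                }
                                                ```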

                                              1. 4

                                                Anyone have a snippet of the kind of code that C2rust ends up generating? Would be interesting to see how much work cleanup would end up being

                                                1. 5

                                                  I did a fairly deep investigation but it was a while ago, the tools have probably changed since: https://wiki.alopex.li/PortingCToRust

                                                  1. 4
                                                    1. 4

                                                      Not sure if the website is still being kept up to date, but www.c2rust.com lets you try it online. The code is pretty un-idiomatic (using unsafe pointers instead of references and so on), but note that there is also a refactoring step being actively worked on. Based on the video, it doesn’t look like they used that here, but this is where the code lives: https://github.com/immunant/c2rust/tree/master/c2rust-refactor.

                                                      Full disclosure: I used to work on this, but I’ve not kept up with developments from the past few months.

                                                      1. 2

                                                        Here’s dwm converted with it.

                                                        1. 2

                                                          You can see yourself. There’s an online translator: https://c2rust.com/

                                                          Rust intentionally tries to discourage use of raw pointers and pointer arithmetic by not having syntax sugar for them, so directly translated C code ends up very ugly.

                                                          It starts looking OK only after you disambiguate pointers into proper references/slices/optionals. In C it’s not clear whether a pointer argument to a function means 0 or 1, exactly 1, or N elements. Rust has separate types for these cases, and syntax sugar and safety come with them.

                                                          1. 1

                                                            It’s a start though! I’d rather start from there, and refactor towards idiomatic than start from scratch in some cases.

                                                        1. 5

                                                          I can’t wait to see performance benchmarks for this! Pie in the sky dream: if this started being competitive with DirectWrite and CoreText it would be perfect for Alacritty - ligatures for free and a cross-platform Rust dependency instead of a mess of cfg’s.

                                                          1. 5

                                                            Note that this is not a DirectWrite competitor, although it can be a part of that. Allsorts does no rendering.

                                                            1. 1

                                                              I don’t think Alacritty uses DirectWrite for rendering though (so as far as Alacritty is concerned, it is a competitor). But it is the case that DirectWrite and CoreText do interesting rendering work that is out of scope for Allsorts.

                                                            2. 2

                                                              That would indeed be pretty cool. I’d like to see it get worked into the existing font/type ecosystem in Rust to make adding it to a project easier. That’s probably not a work time project though. :)

                                                            1. 2

                                                              A common real-world example of Church encoding in Haskell is parser monads. Both parsec and megaparsec use it. My understanding is that GHC does a great job of inlining functions, so higher-order parser combinators end up being better optimized. For instance:

                                                              newtype ParsecT s u m a
                                                                  = ParsecT {unParser :: forall b .
                                                                               State s u
                                                                            -> (a -> State s u -> ParseError -> m b) -- consumed ok
                                                                            -> (ParseError -> m b)                   -- consumed err
                                                                            -> (a -> State s u -> ParseError -> m b) -- empty ok
                                                                            -> (ParseError -> m b)                   -- empty err
                                                                            -> m b
                                                                           }
                                                              
                                                              1. 1

                                                                Along the lines of the malformed class name error is a similar restriction around resource names. I found out the hard way recently that findResource won’t find resources named, say, bundle.js.map. This is because bundle.js.map does not follow the expected name.extension format. The crazy bit is that you can still get at those resource files (they do end up in the JAR), just not with the conventional methods.

                                                                1. 1

                                                                  Have you checked how it behaves on newer JVMs?

                                                                1. 12

                                                                  This isn’t a bad post, but beyond giving a very rudimentary overview of BFS there doesn’t seem to be much in the article demonstrating Go’s standard library. You could do this same exercise in virtually any language that has list and map types in its standard library and the code would look the same.

                                                                  Serious question: do Go users actually re-use the standard list type everywhere instead of having proper types for stacks and queues?

                                                                  1. 3

                                                                    Serious question: do Go users actually re-use the standard list type everywhere instead of having proper types for stacks and queues?

                                                                    If the list type handles both and you can just pick which methods to call in your use case, why would that be an issue? Stacks and queues are very similar internally… you often see very simple implementations of both built on plain slice operations.
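
                                                                    In Java terms (just as an analogy, since the thread is about Go slices), the same idea looks like this: one general-purpose type serves both disciplines depending on which methods you call.

                                                                    ```java
                                                                    import java.util.ArrayDeque;
                                                                    import java.util.Deque;

                                                                    public class OneTypeTwoUses {
                                                                        public static void main(String[] args) {
                                                                            // Used as a stack: push/pop operate on the head (LIFO).
                                                                            Deque<Integer> stack = new ArrayDeque<>();
                                                                            stack.push(1); stack.push(2); stack.push(3);
                                                                            System.out.println(stack.pop()); // 3

                                                                            // Used as a queue: offer appends to the tail, poll takes the head (FIFO).
                                                                            Deque<Integer> queue = new ArrayDeque<>();
                                                                            queue.offer(1); queue.offer(2); queue.offer(3);
                                                                            System.out.println(queue.poll()); // 1
                                                                        }
                                                                    }
                                                                    ```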

                                                                    1. 3

                                                                      If the list type handles both and you can just pick which methods to call in your use case, why would that be an issue?

                                                                      No fundamental issue. I often rely on type signatures to help understand what I’m dealing with, and it is also sometimes helpful to get a compile error (e.g. if I try to use a list that was intended to represent a queue as a stack). I guess folks using Go probably rely on comments explicitly indicating whether the list is a stack or a queue (and helpfully named variables).

                                                                      1. 2

                                                                        From what I’ve seen (using go since 2012) people don’t usually pass around a slice expecting you to push or pop.

                                                                        If you are using a slice as a queue, you make one of two choices: option 1 is to write a wrapper type; option 2 is to reuse some wrapper type built on the empty interface. Soon we’ll get a third option with Go 2 generics.

                                                                        1. 2

                                                                          Did they decide to add generics in Go 2? That would’ve been pretty big news.

                                                                          1. 2

                                                                            Here’s one proposal from the core go team: https://go.googlesource.com/proposal/+/master/design/go2draft-contracts.md

                                                                            I don’t know which release it will land in, but go getting generics is a question of when not if.

                                                                            1. 1

                                                                              I feel like if they had decided that they will implement generics eventually, they would’ve told us. Seems possible that they could explore the design space and decide there is no possible design they are satisfied with.

                                                                    2. 2

                                                                      Yes, you are right, I’m not diving deep into BFS. I actually said in the post there was plenty of pretty awesome material online. I wanted to demonstrate the power of the standard library. You must surely admit having a simple BFS implementation in ~15 lines of code is pretty cool :-)

                                                                      do Go users actually re-use the standard list type everywhere instead of having proper types for stacks and queues

                                                                      I don’t personally do this all the time, but I often tend to start with it just to get things going, and then gradually improve on it, sometimes eventually completely replacing some of the standard library data structure implementations.

                                                                      1. 0

                                                                        Why not? A stack is just a vertical list.

                                                                        1. 2

                                                                          No, a stack is not a vertical list. A stack is a LIFO. If you’re arbitrarily indexing into a stack, it’s a list.

                                                                          1. -1

                                                                            That’s picking nits, and not really useful ones. The stack, as in the C/hardware stack, is pushed to on subroutine call and indexed into throughout the subroutine. It’s hard to argue that it’s not a stack.

                                                                            1. 3

                                                                              It’s not a nitpick. “A stack is just a vertical list” is nonsense, else we’d just call it a list.

                                                                              1. 1

                                                                                Have you ever tried physically rotating a data structure? Of course it’s nonsense! :)

                                                                                Anyway, it depends on how restrictively you define a stack. Is it just nil, push and pop? What about rot, tuck and roll? I’d say nil, cons, car and cdr form both a list and a stack minimally and adequately. Both data structures have a head and a tail, and arbitrary indexing is O(n), so it’s just a matter of perspective.

                                                                                Also you can arbitrarily index into a mutable stack non-destructively if you keep another stack around to form a zip list.

                                                                                1. 1

                                                                                  A stack doesn’t really have a tail; if you’re pushing and popping the “tail”, it’s a list or a deque. rot is not an operation that can be performed on a stack alone; it requires some kind of external storage (in Forth, >r swap r> swap), and swap too requires some external storage. Same for tuck. roll is not a real stack operation; it’s treating a stack like a list and indexing arbitrarily. Names for these data structures exist for a reason. A stack is not a vertical list because the point of a stack is that you push and pop the top item. If you wanted to randomly index into some data structure that might be appended to or truncated, the first data structure to come to mind would not be a stack, because that isn’t the point of a stack.

                                                                                  1. 1

                                                                                    These new external storage requirements aside (even pop requires some external storage), I’m not arguing what things are for, I’m saying what they are. Absolutely you wouldn’t arbitrarily index into a stack, but if you take an index-able data structure like a list (not that cdddr is any different to pop pop pop) and treat it like a stack, you’ve got yourself a stack. You wouldn’t reimplement a list type just to remove a method so you can rename it to “stack.”

                                                                      1. 10

                                                                        It is a straw-man argument to say that macros in LISP-like languages are amazing because you don’t need to generate strings and eval them. Most macro systems I’ve used (in Scala, Haskell, and Rust) work over ASTs instead of strings. In some cases (typed TH), you even get to reason about the type of the expressions represented by your ASTs. Even LISP quoting/unquoting facilities have their analogue in other languages with quasiquotation.

                                                                        I view extra syntax as having a similar benefit as syntax highlighting: it enables my eyes to quickly navigate to specific parts of the code (branching constructs, type signatures, side effects, etc.). In both cases, there’s obviously a balance for everyone.

                                                                        1. 6

                                                                          The point about macros was mostly in comparison with languages that don’t have them at all. I haven’t done much macro work in any language, probably the most in Clojure, a bit in Haskell, and I guess I’ve done template programming in C, if you want to count that, so I’m not really the right person to ask when it comes to the tradeoffs of the different macro systems out there. I’m also very torn on how I feel about them, because in many languages they end up being a pain to debug, so I mostly stay clear of them, but I think they can provide a lot of value to library authors.

                                                                          I like that argument about syntax highlighting/visual distinction, that is a good point.

                                                                        1. 7

                                                                          I recently also implemented a neural net in Rust targeting the MNIST dataset (https://github.com/harpocrates/rust-mnist) while working through http://neuralnetworksanddeeplearning.com. The main surprise I had was about how nice the dependency management story was.

                                                                          • Using ndarray (instead of Vec), it was 5LOC to add a compile time feature flag to enable using the GPU (which then resulted in ~35% speedup)
                                                                          • Adding a web UI (for interactively identifying digits drawn on an HTML5 canvas) took 100LOC of Rust with the simple-server crate

                                                                          I found both these libraries by searching through https://crates.io/ and figured out how to use them by looking at the automatically generated docs.

                                                                          1. 18

                                                                            For the love of God, do we have to label 50 year old software methods with some misused term from Category Theory?

                                                                            1. 22

                                                                              The term “monad” does apply here though, albeit without its full power. We should be glad to see that old techniques are being understood from another angle. Understanding monads has helped me see lots of patterns in code I write every day. It’s not merely a useless theoretical construct.

                                                                              1. 6

                                                                                I’m interested in hearing about what insights you got from “monad”. To me, it’s just pretentious.

                                                                                1. 10

                                                                                  For me, monads are a concept that lets me think about computation that carries baggage from one statement to the next. The baggage can be errors, a custom state object, or context related to IO or async computations. Different things, similar way it looks in code. Recognizing the monad concept helps me understand the contours of the implementation and also helps me generate better designs. It goes both ways. I do think there’s value in using the word. It’s a way to compact thought.
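
                                                                                  The “baggage” framing shows up even outside Haskell. As a hypothetical sketch (names invented), Java’s Optional.flatMap is a monadic bind: it threads the “may be absent” baggage from one step to the next without explicit null checks.

                                                                                  ```java
                                                                                  import java.util.Optional;

                                                                                  public class MonadishOptional {
                                                                                      // Each step may fail; Optional carries the failure baggage.
                                                                                      static Optional<Integer> parse(String s) {
                                                                                          try { return Optional.of(Integer.parseInt(s)); }
                                                                                          catch (NumberFormatException e) { return Optional.empty(); }
                                                                                      }

                                                                                      static Optional<Integer> reciprocal(int n) {
                                                                                          return n == 0 ? Optional.empty() : Optional.of(100 / n);
                                                                                      }

                                                                                      public static void main(String[] args) {
                                                                                          // flatMap chains the steps; any failure short-circuits the rest.
                                                                                          System.out.println(parse("4").flatMap(MonadishOptional::reciprocal));
                                                                                          System.out.println(parse("0").flatMap(MonadishOptional::reciprocal));
                                                                                          System.out.println(parse("x").flatMap(MonadishOptional::reciprocal));
                                                                                      }
                                                                                  }
                                                                                  ```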

                                                                                  1. 6

                                                                                    You may have reasons to dislike Haskell, but that shouldn’t mean you need to deprive yourself of the benefits of these fundamental concepts. Take the signatures of the following functions:

                                                                                    ($)      ::                                     (a -> b) ->   a ->   b
                                                                                    (<$>)    ::                      Functor f =>   (a -> b) -> f a -> f b
                                                                                    (<*>)    ::                  Applicative f => f (a -> b) -> f a -> f b
                                                                                    (=<<)    ::                        Monad f => (a -> f b) -> f a -> f b
                                                                                    foldMap  ::         (Foldable t, Monoid m) => (a -> m  ) -> t a -> m
                                                                                    foldMap  ::     (Foldable t, Monoid (f b)) => (a -> f b) -> t a -> f b
                                                                                    traverse :: (Traversable t, Applicative f) => (a -> f b) -> t a -> f (t b)
                                                                                    

                                                                                    These are all different ways of combining a function that may have some super powers (in a functor context, producing a functor context etc.) with some value that may have some super powers and obtain something with super powers. And note how you can combine them together to build a composite value. By the way, there are infinitely many like these, and new useful ones are discovered all the time.

                                                                                    Having an intuition for these combinators, combined with a knowledge of common types that satisfy the Functor, Applicative, Monad, Foldable, Monoid and Traversable typeclasses, helps you write reusable code. Instead of tying down the implementation to a specific combination of concepts, you recognize the actual pattern the first time, so your code becomes easier to understand* and reusable the first time, with less typing!

                                                                                    (*) Easier to understand for someone who has an intuition for these concepts. Because if you see a function with signature Applicative f => f A -> B -> f C, you immediately know that the function may decide to invoke the f effect you gave it (f A) a number of times and it may base that decision on B and it may also use your A and B to construct the C, but it’s impossible for it to use the A to decide on how to invoke the effects of f.

                                                                                    Reading tips:

                                                                                    • a :: b: a has type b
                                                                                    • a -> b: a function from a to b
                                                                                    • f a: Type constructor f (for instance List) applied to type a (to produce “list of a”)
                                                                                    • a -> b -> c: Same as a -> (b -> c): a function that takes a and produces another function that will take b to produce c, a two-argument function in practice.
                                                                                    • c a => ... a ...: a is any type that has an instance (implementation) of the typeclass (interface) c
                                                                                    1. 5

                                                                                      Most people don’t know this, but the Haskell version of a monad is a crude simplification of the category-theory definition of a monad. It’s a lot less general.

                                                                                      1. 4

                                                                                        It’s always possible that I am wrong, but to me, “monad” as used in Haskell is partly just a pompous way of describing well-known and trivial programming patterns, and partly a painful effort to permit the absolutely necessary use of state variables within a system based on the incorrect theory that one should program statelessly.

                                                                                        http://www.yodaiken.com/2016/12/22/computer-science-as-a-scholarly-discipline/

                                                                                        1. 11

                                                                                          Full disclosure, I’m a monad-lovin’ fool, but to me the utility of the concept is to see that some (yes) well-known and trivial programming patterns are in fact conceptually the “same thing”. It unifies ideas, that many programmers only learn by convention, underneath a single concept which has more than just the weight of convention behind it.

                                                                                          It’s totally possible to go overboard with it, or to over-rely on the flimsy “mathiness” of FP as a way of dismissing or trivializing real problems, but imo if you want a bedrock understanding of programming patterns that is based in something factual (as opposed to flavor-of-the-month tribalism or appeals to the current authority), FP is the way to go.

                                                                                          1. 5

                                                                                            I’m with you on the pompous and unenlightening terminology, but that’s a pretty unfair characterization of the role monads play in demarcating state-dependent code.

                                                                                            1. 4

                                                                                              I think there are a couple of real insights in FP, but the “math” continues the basic CS confusion between meta-mathematics/axiomatic and working mathematics - which is often about processes and algorithms and state. And the project falls for a common CS error, in which we note that doing X is hard, messy, error prone, so we create a system which doesn’t do X because that’s clean/light/well-defined “easy to reason about” (even if not in practice) and then we discover that X is essential, so we smuggle it back in awkwardly. I just don’t see the insight that the Category theory brings to understanding state.

                                                                                              1. 4

                                                                                                My experience with describing stateful systems is that it’s much more elegant and bug-free in pure languages because you get to reason about it explicitly. I don’t see many imperative languages letting me statically prove propositions about the system state (“this resource exists in that state at that point in the code”). You need equational reasoning and powerful type systems for that. Monads help.

                                                                                                The point of purity isn’t about avoiding state and mutability, it’s about avoiding implicit state.

                                                                                                1. 2

                                                                                                  That’s great. Do you have a link to a paper/post which works out a proof of some non-trivial program?

                                                                                                  1. 1

                                                                                                    The most complex I’ve seen is probably the state management in my video game, where it ensures proper initialization, resource management and deallocation. It’s very simple on the user side, it consists only in writing and complying with the state transition specifications in the types, e.g. quitRenderer : (r : Var) -> ST m () [remove SRenderer r]. This is a method in the Draw interface whose type says that it removes the renderer from the context m. SRenderer is some Type defined by the implementation, in my case a more complex composite type, and if I were to implement that method without properly handling those subcomponents, it wouldn’t compile. There is much more to the “complying” part though. ST tracks your state as you further develop your function, and you just print it out. It’s like a static debugger. You see how your states evolve as you unwrap and modify them without even running your code, and I cannot overstate how rarely additional changes are needed after you test, compared to other languages.

                                                                                                    I haven’t worked on other serious Idris projects; I’ve just done the book exercises and then went to code my game. I definitely recommend the book, though.

                                                                                                2. 1

                                                                                                  Yeah, I’m with you on the category theory, that could easily be obfuscatory math-envy (still looking for an example of a software situation where category-theory concepts carry more than their weight.) But separating out X as much as possible, and making it clear where X has to be mixed back in, that’s just modularization, usually a good policy.

                                                                                                  1. 0

                                                                                                    But modularization is hard, and where module boundaries don’t work, creating a backdoor often just makes things less modular. To me, what would be more interesting than “pure functions” plus an escape hatch is to clarify exactly what, if any, state dependencies exist. I don’t want to bury e.g. a file system pointer dependency in some black box; I want to know that f(fd,y,z) depends on fd.seekpointer internally and fd.file contents externally, or, for a simpler example, that f(x) has a dependency on time-of-day. I don’t see much serious work on trying to expose and clarify these dependencies.

                                                                                                    1. 4

                                                                                                      That’s exactly the role monads are supposed to play in Haskell.
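
                                                                                                      As a hedged Haskell sketch of that idea (the HasClock class here is hypothetical, not a standard API): a constraint in the type surfaces exactly the time-of-day dependency the parent comment wants exposed, and a Reader instance lets tests pin the clock to a fixed value.

                                                                                                      ```haskell
                                                                                                      {-# LANGUAGE FlexibleInstances #-}
                                                                                                      import Control.Monad.Reader

                                                                                                      -- Hypothetical capability: any monad that can report the time of
                                                                                                      -- day (seconds since midnight, to keep the example simple).
                                                                                                      class Monad m => HasClock m where
                                                                                                        timeOfDay :: m Int

                                                                                                      -- In production this would be an IO instance reading the real
                                                                                                      -- clock; for tests, a Reader supplies a fixed time.
                                                                                                      instance Monad m => HasClock (ReaderT Int m) where
                                                                                                        timeOfDay = ask

                                                                                                      -- The HasClock constraint is the exposed dependency: the type
                                                                                                      -- itself says that isMorning consults a clock.
                                                                                                      isMorning :: HasClock m => m Bool
                                                                                                      isMorning = fmap (< 12 * 3600) timeOfDay

                                                                                                      main :: IO ()
                                                                                                      main = print (runReader isMorning 3600)  -- prints True (01:00)
                                                                                                      ```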

                                                                                              2. 1

                                                                                                I don’t have much problem with naming the idea of monads, though I definitely agree that pure functional programming is not a good way to program.

                                                                                                1. 1

                                                                                                  I read your linked post. Since you don’t provide commentary, it’s hard to understand what your “Oh” is meant to imply. I’m guessing that you’re trying to highlight a contradiction between “Why functional programming matters” and the “Unix paper”. Could you explain your idea here?

                                                                                                  1. 1

                                                                                                    Hughes describes as a novel advance of FP something that was well known a decade earlier in systems programming, and described in one of the most famous CS papers ever published. The concept of programs streaming data over a channel, of course, also predates UNIX, which is why Dennis Ritchie, who did know the literature, did not make a claim as ignorant as Hughes’, or the one in the paper discussed here.

                                                                                                    1. 2

                                                                                                      Hmm. I don’t know the literature either. Given the context in your post, your critique seems fair: much of the first paragraph from “Why functional programming matters” quoted in your blog post isn’t novel in light of the “Unix paper” from more than a decade earlier.

                                                                                                      That said, there are some novel things which lazy-evaluation brings to the table. Hughes hints at them in the section that you quoted:

                                                                                                      1. It makes it practical to modularize a program as a generator that constructs a large number of possible answers, and a selector that chooses the appropriate one.

                                                                                                      2. You can take this further: The “possible answers” may be large data structures which are expensive to compute, and lazy evaluation will only compute enough of the data structure to determine whether you want to use it, leaving the rest unevaluated when the “selector” rejects them.

                                                                                                      Strict evaluation doesn’t give you either of these advantages unless you construct them explicitly. You can get #1 with blocking semantics on reading and writing Unix pipes, as Ritchie showed. You probably can’t get #2 in a general way without lazy evaluation.
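
                                                                                                      A minimal Haskell sketch of both points (illustrative names): the generator below is conceptually unbounded, and the selector’s demand determines how much of it is ever computed.

                                                                                                      ```haskell
                                                                                                      -- Generator: an infinite list of candidate answers.
                                                                                                      -- Under lazy evaluation, none of it exists until demanded.
                                                                                                      candidates :: [Int]
                                                                                                      candidates = map (\n -> n * n) [1 ..]

                                                                                                      -- Selector: picks the first acceptable candidate. Only the
                                                                                                      -- squares up to the first one exceeding 1000 are computed
                                                                                                      -- (point #1); if each candidate were itself a large structure,
                                                                                                      -- only the part the predicate inspects would be forced (point #2).
                                                                                                      select :: [Int] -> Int
                                                                                                      select = head . filter (> 1000)

                                                                                                      main :: IO ()
                                                                                                      main = print (select candidates)  -- prints 1024 (32 * 32)
                                                                                                      ```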

                                                                                          2. 19

                                                                                            misused term

                                                                                            Can we please stop fighting jargon? Do other industries have people constantly calling for things to be named differently? Like, does the Fashion Industry have people fighting to rename “balaclava” to “winter weather knitted headwear” because it’s headwear named after a Battle, and not really descriptive?

                                                                                            If not “monad,” what’s a better alternative?

                                                                                            1. 8

                                                                                              It seems pretty clear to me that the purpose of this post is to shame Rob Pike; people already think he’s too stupid to understand the Hindley-Milner type system, so this just adds onto that.

                                                                                              1. 6

                                                                                                I don’t think the post shames Rob Pike. I think it’s meant to point out that a pattern that shows up in the Go stdlib also shows up elsewhere.

                                                                                              2. 2

                                                                                                I think in the pursuit of knowledge it makes sense to name & build understanding around the things that we already do, or observe nature doing. That’s what applying a “misused term” to “50 year old software methods” accomplishes.

                                                                                                1. 2

                                                                                                  I don’t think jargon confers knowledge.

                                                                                                  1. 1

                                                                                                    I’d rather we name concepts than ceaselessly describe them in imprecise terms. That’s what the article is doing, and it means that people can recognize the pattern if they’ve already used it elsewhere, that communication is clearer because it is more concise, and that mistakes can be identified more easily.

                                                                                                    So, while jargon might not confer knowledge, it enables communication and facilitates thought.

                                                                                                    1. 1

                                                                                                      To me, there is nothing precise about the concept of monad as used in FP. It’s a collection of messy hacks. The Moggi paper is a travesty of Category Theory.

                                                                                                2. 1

                                                                                                  You really need to relax.

                                                                                                1. 4

                                                                                                  This article misses one of the core differences between ByteBuffer and ByteArrayInputStream/ByteArrayOutputStream: the former is not thread safe, while the latter are. Granted, when you are deserializing, you are usually reading from the buffer from only one thread.

                                                                                                  It is also a bit disappointing that there is no ByteBuffer-backed structure (also not thread safe and also fast) that dynamically allocates new buffers to support writing in data whose final size you don’t yet know. Sure, I could roll my own (or use one of the many half-baked library implementations), but given that the JDK supports ConcurrentSkipListSet, surely it has space for this simple addition?

                                                                                                  1. 1

                                                                                                    Oh! That could explain a lot. Taking a mutex costs roughly 15 ns on contemporary CPUs, even on the completely happy zero-contention path (since that is about how long an uncontended lock cmpxchg takes, the last time I benchmarked it on a single-socket dual-core Intel laptop from about 2017).

                                                                                                    Honestly, “thread safe byte stream” sounds like a really silly API to me. If two different threads are pulling bytes out of the same file descriptor, it’s almost completely implausible that they won’t tread on each other’s toes. Even with a mutex guarding their read() calls so that each attempt to get a given number of bytes forms a critical section (even if it happens to involve multiple underlying read(2) syscalls), the only plausible data formats where you wouldn’t cause confusion would have to be ones where each record is fixed-size (so you never get one thread causing another to read halfway through a record) and the ordering of the records doesn’t matter at all.

                                                                                                  1. 2

                                                                                                    Great article!

                                                                                                    It’s a bit crummy that optimizations in GHC are so hit and miss. We really need a good story for reliable optimizations. As the author notes, it is way too easy for your carefully tuned code to suddenly fall off the good path due to some change in GHC. I wish there were some way to at least say: “don’t compile unless this optimization manages to fire!”. Same goes for hackery around rewrite rules.

                                                                                                    1. 3

                                                                                                      I wish there were some way to at least say: “don’t compile unless this optimization manages to fire!”. Same goes for hackery around rewrite rules.

                                                                                                      That’s actually possible! There’s a recent package called inspection-testing which (with the help of a GHC plugin) allows testing certain properties of an expression’s compiled form, e.g. that calling some high-level API in a certain way is equal to some particular optimised implementation. AFAIK, since it runs at compile time, it will bail out upon failure, like you ask :)

                                                                                                      I’ve also come across other testing libraries for non-functional properties (i.e. things other than input/output behaviour), e.g. http://hackage.haskell.org/package/StrictCheck

                                                                                                      It’s also feasible to check this sort of thing with benchmarks (e.g. criterion, weigh or bench). Projects like asv (which I’ve used for Haskell via the asv-nix plugin) can track performance over commits and look for step-changes, although that requires a reasonably stable environment to keep timings consistent. It might be more robust to just check the performance of multiple implementations, e.g. that implementation X takes between 1.2 and 1.5 times as long as hard-optimised implementation Y, and between 0.1 and 0.2 times as long as naive implementation Z.

                                                                                                      1. 1

                                                                                                        If I’ve got time to make a hard-optimised implementation which I know won’t regress, that’s what I’ll use!

                                                                                                        I agree that tracking performance over time is a good technique, if sometimes a bit tricky to set up properly. GHC itself has a cute approach of storing performance metrics inside of git notes (all as part of its CI process).

                                                                                                        That’s not to say that GHC plugin backed solutions aren’t pretty neat too. I forget the particulars, but I know the generic-lens package has some neat testing to check that generically-derived lens are as efficient (meaning all the generic code inlines properly) as manually written ones.

                                                                                                        1. 1

                                                                                                          If I’ve got time to make a hard-optimised implementation which I know won’t regress, that’s what I’ll use!

                                                                                                          I meant something more like the blog post, which compares against alternatives written in C and Rust, which turned out to be unusable in their project for different reasons. I also like the blog’s use of a dummy implementation to provide a lower bound, which we could do if there are no suitable alternatives to benchmark against.

                                                                                                          Also, just in case I was misunderstood about inspection-testing: the idea there is that we compare a general solution called with some known arguments against a non-solution which just so happens to hard-code those arguments. For example, if we want to check that the function prefixEachWith will unroll/inline the recursion on its first argument, we can compare:

                                                                                                          lhs = prefixEachWith ["foo", "bar", "baz"]
                                                                                                          
                                                                                                          rhs = map (\x -> [x:"foo", x:"bar", x:"baz"])
                                                                                                          

                                                                                                          The rhs is optimised, in a way that won’t regress, but only for this particular test. It isn’t a general solution to the problem.
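
                                                                                                          To make that concrete, here is a hypothetical definition of prefixEachWith consistent with the types in the example above, plus a value-level sanity check; note that the actual inspection-testing comparison inspects the compiled Core of lhs and rhs at build time rather than comparing runtime values.

                                                                                                          ```haskell
                                                                                                          -- Hypothetical definition (the example above doesn't give one):
                                                                                                          -- prefix every string in ss with each character of the second
                                                                                                          -- argument.
                                                                                                          prefixEachWith :: [String] -> [Char] -> [[String]]
                                                                                                          prefixEachWith ss = map (\x -> map (x :) ss)

                                                                                                          lhs, rhs :: [Char] -> [[String]]
                                                                                                          lhs = prefixEachWith ["foo", "bar", "baz"]
                                                                                                          rhs = map (\x -> [x : "foo", x : "bar", x : "baz"])

                                                                                                          main :: IO ()
                                                                                                          main = print (lhs "ab" == rhs "ab")  -- prints True
                                                                                                          ```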

                                                                                                          GHC itself has a cute approach of storing performance metrics inside of git notes (all as part of its CI process).

                                                                                                          It’s a nice idea, but unfortunately the only time I see it come up is when the allowed boundaries on the timings get bumped to stop spurious-seeming build failures :( I think tracking benchmarks over time is good for human perusal, and maybe some warnings, but not worth aborting a build over. Benchmarking different implementations (perhaps in different languages, perhaps dummies) is less fragile, and perhaps worth aborting for, since we’re measuring relative performance, which should cancel out many aspects of the particular machine being used.