1. 10

  2. 11

    I’m sorry, but this is not how you write micro-benchmarks. In this case, the methodology does end up affecting the results: when you fix that, bitfields don’t end up being nearly so much faster. The author is also clearly biased, and seems more concerned about reaching the conclusions that they want than being correct (seeing 0.000000057 should cause them to question their entire benchmark process, not to conclude that bitfields are obviously much better).

    Rather than just complain, I ported this benchmark to JMH (full code here - I’m not a JMH expert, so please point out flaws) and the numbers I got indicate something much more reasonable:

    [info] Bench.enumSet  avgt  100   6.605 ± 0.173  ns/op
    [info] Bench.hashSet  avgt  100  83.031 ± 9.414  ns/op
    [info] Bench.maskSet  avgt  100   4.200 ± 0.123  ns/op

    In case you are wondering, the more than half of the hashSet time is actually just creating the hash set.

    That really brings me to the final point: this benchmark tells us very little about HashSet, EnumSet, or bit flags. It doesn’t mix in multiple checks (you should probably benchmark groups of operations so the JIT doesn’t overfit in a way it couldn’t in a real example) and it doesn’t check different operations.

    1. 6

      I read through the corresponding Java source (EnumSet, RegularEnumSet, JumboEnumSet), and it seems that the main extra work done by java.util.EnumSet is bounds-checking. Additionally, more than 64 fields per enum are supported, although that code path is probably not entered by the benchmark. It is good to know the costs of safety and feature-completeness.

      1. 5

        It is good to know the costs of safety and feature-completeness.

        Is that cost unavoidable though? Ada’s representation clauses are just as safe and feature-complete (I’d argue even more feature complete, as you can have parts of your bitfields span multiple bits!) while being more performant. Take a look at the examples here if you’re curious: https://docs.adacore.com/gnat_rm-docs/html/gnat_rm/gnat_rm/representation_clauses_and_pragmas.html#record-representation-clauses

        1. 2

          The cost is not nearly as high as the blog suggests (probably less than a 2x factor in general). Yes, it isn’t trivial, but it isn’t orders of magnitudes. The OP is just not benchmarking properly (see the other comments for more). Unless the codepaths for more than 64 fields are exercised, the JIT will pretty quickly realize those are unlikely (and optimize accordingly).

          Very few JVM abstractions are “zero cost” (the way you could argue they often are in Rust), but OTOH the JVM does a pretty good job of optimizing at runtime based on your actual workload.

        2. 5

          Without doing a bit of benchmarking, I’d assume that EnumSet will be slower than a bitfield. I would pick a bitfield in a case I thought it might matter.

          However, the benchmark in this post is bad. Let’s start with the small stuff: the benchmarks use straight-line loops, the code all three benchmarks in sequence in the same JVM[0], and the results are presented without units.

          The kicker is that the author notices that the JVM has optimized away one of his benchmark cases, and argues that this is fine, because it means that EnumSet “inhibits optimizations.” That’s not valid. It means that they inhibit dead code elimination in this particular case. Your actual code using bitfields will typically not be dead, so why measure that?

          [0] I can’t say whether the same JVM is a problem in this particular case, but it’s a bad practice in general.

          1. 5

            Those benchmarks are rather questionable. System.gc() really? If you really must write a micro-benchmark, then please use JMH the de-facto accepted standard for those in java.: https://github.com/openjdk/jmh