    The benchmarks here use a “random series of bytes”, but I wonder what the perf looks like for input that is “mostly one kind of character”. For example, a lot of UTF-8-encoded code is going to be mostly single-byte, whereas a lot of CJK-heavy text will be something like 20% single-byte and 80% three-byte, since most CJK characters encode to three bytes in UTF-8.
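
To make the "mostly one kind of character" idea concrete, here is a minimal sketch that measures how a given text splits across UTF-8 sequence lengths. The function name `utf8_length_histogram` and the sample strings are made up for illustration and are not part of the benchmark being discussed; real corpora would be needed to get representative ratios.

```python
from collections import Counter


def utf8_length_histogram(text: str) -> dict[int, float]:
    """Return the fraction of code points that encode to 1, 2, 3, or 4 UTF-8 bytes.

    Hypothetical helper for illustration; returns an empty dict for empty input.
    """
    # Encode each code point on its own and count how many bytes it needs.
    counts = Counter(len(ch.encode("utf-8")) for ch in text)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {n: counts.get(n, 0) / total for n in (1, 2, 3, 4)}


if __name__ == "__main__":
    # Illustrative samples only: one ASCII-only code snippet, one mixed Japanese/ASCII line.
    code_sample = "for (int i = 0; i < n; i++) { sum += a[i]; }"
    cjk_sample = "今日は良い天気です。Hello"

    print("code:", utf8_length_histogram(code_sample))
    print("cjk: ", utf8_length_histogram(cjk_sample))
```

Running this over a real source tree and a real CJK corpus would give distributions that could be used to build benchmark inputs closer to actual workloads than uniformly random bytes.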