Neat, I learned about some techniques that were new to me.
I’ve spent some time reversing custom/proprietary compression algorithms found in various video games, and usually they’re just LZSS variants that differ only in the exact encoding (how big the literal-or-backref bitsets are and how backrefs are encoded).
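For anyone who hasn’t run into one, here is a minimal sketch of what decoding such a variant can look like. The layout is an assumption for illustration, not any particular game’s format: one flag byte per group of up to 8 items read LSB first, a set bit meaning a literal byte, a clear bit meaning a 2-byte little-endian backref with a 12-bit distance and a 4-bit length (stored as length minus 3). The name `lzss_decompress` is hypothetical.

```python
def lzss_decompress(src: bytes) -> bytes:
    """Decode the assumed LZSS layout described above (illustrative only)."""
    out = bytearray()
    i = 0
    while i < len(src):
        flags = src[i]  # one flag bit per following item, LSB first
        i += 1
        for bit in range(8):
            if i >= len(src):
                break  # the last group may hold fewer than 8 items
            if flags & (1 << bit):
                # Set bit: copy one literal byte straight through.
                out.append(src[i])
                i += 1
            else:
                # Clear bit: 16-bit backref, little-endian.
                if i + 1 >= len(src):
                    raise ValueError("truncated backref")
                token = src[i] | (src[i + 1] << 8)
                i += 2
                distance = (token >> 4) + 1   # assumed: upper 12 bits, stored minus 1
                length = (token & 0xF) + 3    # assumed: lower 4 bits, stored minus 3
                if distance > len(out):
                    raise ValueError("backref points before start of output")
                # Copy byte by byte so overlapping references repeat data.
                for _ in range(length):
                    out.append(out[-distance])
    return bytes(out)
```

In practice the flag polarity, the bit order, the split between distance and length bits, and the minimum match length are exactly the details that change from one variant to the next.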
Once, however, I came across something quite different: byte pair encoding. The general idea is to compress by repeatedly replacing the most common pair of bytes with a byte value that doesn’t occur in the data, then store the final compressed data plus the list of substitutions made, and apply those substitutions in reverse order to decompress. I thought I’d mention it since it wasn’t included in the linked-to book, but I don’t know how useful it is in practice (one huge drawback is that you can’t stream BPE). LZ-based approaches certainly seem to have won the lossless compression war.
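A rough sketch of that idea, in case it helps. This is not the format I reversed: the function names are made up, the substitution list is kept as a plain Python list rather than serialized, and stopping once the best pair occurs fewer than twice is my own assumption.

```python
from collections import Counter

def bpe_compress(data: bytes):
    """Repeatedly replace the most common byte pair with an unused byte value."""
    buf = list(data)
    substitutions = []  # (token, (left, right)) in the order applied
    while len(buf) >= 2:
        present = set(buf)
        token = next((b for b in range(256) if b not in present), None)
        if token is None:
            break  # every byte value is in use, nothing left to substitute with
        pair, count = Counter(zip(buf, buf[1:])).most_common(1)[0]
        if count < 2:
            break  # assumed stopping rule: no pair repeats
        out, i = [], 0
        while i < len(buf):
            if i + 1 < len(buf) and (buf[i], buf[i + 1]) == pair:
                out.append(token)
                i += 2
            else:
                out.append(buf[i])
                i += 1
        buf = out
        substitutions.append((token, pair))
    return bytes(buf), substitutions

def bpe_decompress(data: bytes, substitutions) -> bytes:
    """Undo the substitutions in reverse order."""
    buf = list(data)
    for token, (left, right) in reversed(substitutions):
        out = []
        for b in buf:
            if b == token:
                out.extend((left, right))
            else:
                out.append(b)
        buf = out
    return bytes(buf)
```

Usage is just `packed, table = bpe_compress(raw)` followed by `bpe_decompress(packed, table)`. Note that every pass of the compressor scans the whole buffer to count pairs, which is one way to see the streaming problem.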