1. 17
  1. 13

    for the typical images where I would use PNG, QOI is around 3x larger, and some outliers are far worse. … However, as I said, QOI’s strength is its trade-off sweet spot. The specification is one page, and an experienced developer can write a complete implementation from scratch in a single sitting. My own implementation is about 100 lines of libc-free C for each of the encoder and decoder.

    I’m not sure when “I can write my own codec in 200 LOC” would be a selling point for an image format, unless of course you just happen to love writing image codecs. Every platform you could ever consider using already has PNG support, at a net cost to you of basically zero LOC.
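
    To be fair, the “200 LOC” figure is believable. Going by the one-page spec, the whole decode loop is roughly this shape (a sketch from the spec, not the author’s code; QOI_OP_LUMA and the alpha ops are left out):

      typedef struct { unsigned char r, g, b, a; } px_t;

      /* Sketch of a QOI decode loop per the one-page spec. */
      void qoi_decode_sketch(const unsigned char *bytes, int len,
                             px_t *pixels, int num_pixels) {
          px_t px = {0, 0, 0, 255}, index[64] = {0};
          int p = 0, run = 0;
          for (int i = 0; i < num_pixels; i++) {
              if (run > 0) {
                  run--;                            /* inside a QOI_OP_RUN */
              } else if (p < len) {
                  int b1 = bytes[p++];
                  if (b1 == 0xFE) {                 /* QOI_OP_RGB: three literal bytes */
                      px.r = bytes[p++]; px.g = bytes[p++]; px.b = bytes[p++];
                  } else if ((b1 & 0xC0) == 0x00) { /* QOI_OP_INDEX: recently seen pixel */
                      px = index[b1 & 0x3F];
                  } else if ((b1 & 0xC0) == 0x40) { /* QOI_OP_DIFF: 2-bit channel deltas */
                      px.r += ((b1 >> 4) & 3) - 2;
                      px.g += ((b1 >> 2) & 3) - 2;
                      px.b += ( b1       & 3) - 2;
                  } else if ((b1 & 0xC0) == 0xC0) { /* QOI_OP_RUN: repeat previous pixel */
                      run = b1 & 0x3F;
                  }
                  index[(px.r*3 + px.g*5 + px.b*7 + px.a*11) % 64] = px;
              }
              pixels[i] = px;
          }
      }

    Which still doesn’t make it a selling point; it just makes the claim plausible.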

    1. 4

      Reading some of the author’s other posts, I get the sense that “implementable from scratch” is one of his values in itself, rather than a means to some other end. For example, there’s his article on implementing hash tables from scratch, and his showcase of self-contained C programs that output video.

      Being able to use other people’s code is a huge boon to productivity, but I can also appreciate the art of learning to do things yourself (provided that that is the intention, of course).

      1. 4

        I would assume PNG is significantly slower to decode, though? Depending on the kind of game, you may have to stream assets in from disk very quickly, and the time it takes for the CPU to decode a PNG might end up being critical.

        The way he talks about QOI as an alternative to run-length encoding makes me think decoding speed is truly one of the main objectives here.

        At least that’s the rational explanation I can come up with. It’s obviously also possible that it’s just a desire to implement everything from scratch for no rational reason. Which is also completely fine.

        EDIT: To confirm, the QOI web page does indeed claim that it’s 3-4x faster to decode than PNG.
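
        That claim is at least plausible given how little work an RLE-style decoder does per pixel. A generic sketch (32-bit pixels, not QOI’s exact scheme):

          #include <string.h>

          /* Generic RLE decode: one length byte, one pixel, then a tight
             fill loop. There is barely anything for the CPU to do. */
          void rle_decode(const unsigned char *src, int src_len,
                          unsigned int *dst) {
              int p = 0;
              while (p + 5 <= src_len) {
                  unsigned char n = src[p++];      /* run length */
                  unsigned int px;
                  memcpy(&px, src + p, 4); p += 4; /* the repeated RGBA pixel */
                  while (n--) *dst++ = px;
              }
          }

        PNG’s DEFLATE stage (Huffman decoding plus LZ77 matches, followed by per-scanline unfiltering) simply has more to do per output byte.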

        1. 4

          For modern games it’s probably better to use something like Basis Universal which gives you a compressed texture format. You get a better on-disk compression ratio and use less VRAM.
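
          The VRAM point matters because the GPU samples those formats directly, so whatever Basis transcodes to is uploaded as-is. A sketch, assuming the transcoder has already produced BC3/DXT5 blocks for this GPU:

            #include <GL/gl.h>
            #include <GL/glext.h>

            /* Upload already-compressed BC3/DXT5 blocks; the data stays
               compressed in VRAM instead of being expanded to RGBA8. */
            void upload_bc3(GLuint tex, GLsizei w, GLsizei h,
                            const void *blocks, GLsizei nbytes) {
                glBindTexture(GL_TEXTURE_2D, tex);
                glCompressedTexImage2D(GL_TEXTURE_2D, 0,
                                       GL_COMPRESSED_RGBA_S3TC_DXT5_EXT,
                                       w, h, 0, nbytes, blocks);
            }

          With QOI (or PNG) you decode to raw RGBA8 and pay 4 bytes per texel in VRAM; BC3 costs 1 byte per texel.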

          1. 3

            Basis Universal

            First time hearing about that. What a treat, thanks for the reference! I was already kind of bummed by the complex code paths needed to support both ETC2 on Android and DXT/S3TC on desktop.

            1. 1

              Basis Universal looks like a good choice, but I haven’t done enough research to judge its pros and cons. It’s worth noting that it does rely on extensions, so you can’t use it with standard OpenGL, and the last time I looked at the world of compressed textures, it seemed like an absolute patent minefield.

              I’m almost certain that a modern NVMe drive can read a PNG much faster than a CPU can decode it. But I’m not going to dig up the numbers and do the math right now, and I’m open to being proven wrong. (We have to remember, though, that the CPU has a lot of other stuff to do while playing a game; not all of it can be dedicated to decoding PNGs.)

              1. 3

                Basis got standardized. If you’re using compressed textures, that’s definitely better than loading QOI and then doing the hard work of compressing the textures yourself.

                For now I’d bet that games are bottlenecked on single-core performance and can find spare cores for decompressing something better, at least LZ4 if not zstd. And if you can live with QOI’s low compression ratio, you can’t have enormous amounts of data to load in the first place.

                But for really large worlds that need to be loaded in real time, the future is DirectStorage and friends. Weirdly enough, DirectStorage supports hardware-accelerated DEFLATE, so you can offload PNGs, but not QOI. The PS5 uses Kraken, which beats both on compression ratio.

          2. 2

            I suspect that this may be different if the context is textures, because you can get some big wins from writing the decoder as a small GPU kernel: load the compressed images there, decode them, and recompress them with whatever the specific GPU has hardware support for. That said, I thought OpenGL defined some interchange formats these days that most GPUs can either transcode in hardware or with off-the-shelf kernels shipped as part of the driver stack.

          3. 5

            Simple/fast compressors may seem interesting when you look at a single metric like code size or raw decode speed, but compressors need to be evaluated as part of a system, e.g.:

            • If you have many MBs or GBs of assets, then compressing them even slightly worse will cost you far more disk space than a few extra KB of decompressor code would save.

            • Faster decode at the cost of increased disk I/O may be a net negative for overall loading time. Especially if you have many separate assets that can be loaded in parallel, the time to load them isn’t serial disk + decode time, where decode matters, but rather max(disk, decode), where decode time is completely masked by I/O and irrelevant as long as it is not slower than disk (toy numbers below).
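
            To put toy numbers on that second point (invented for illustration): say you have 100 assets, each needing 4 ms of disk read and 1 ms of decode, loaded through a pipeline that overlaps I/O with decoding:

              serial:     100 * (4 ms + 1 ms)    = 500 ms
              pipelined: ~100 * max(4 ms, 1 ms)  = ~400 ms

            The 1 ms of decode hides entirely behind the I/O, so halving it buys you nothing, while a better compression ratio that shrinks the 4 ms of reading helps directly.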

            1. 1

              Sooo, how does it compare to https://tools.suckless.org/farbfeld/ ? The author mentions it but doesn’t go into detail… Why should we be more excited about this than about what already exists in this space?

              1. 2

                From the look of it, QOI generates smaller files without much more complexity. It has some basic RLE compression and some basic difference encoding for storing a pixel as “previous pixel +/- some small numbers”, which fits into 1-2 bytes instead of 4.
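
                To make the difference encoding concrete: per the spec, QOI_OP_DIFF packs all three channel deltas into a single byte whenever each one fits in -2..1. An encoder-side sketch (not the reference code):

                  typedef struct { unsigned char r, g, b; } rgb_t;

                  /* QOI_OP_DIFF: byte 0b01rrggbb, each 2-bit delta stored with
                     a bias of 2. Emits one byte and returns 1 if the diff fits,
                     returns 0 otherwise. */
                  int try_op_diff(rgb_t px, rgb_t prev, unsigned char *out) {
                      signed char dr = px.r - prev.r;
                      signed char dg = px.g - prev.g;
                      signed char db = px.b - prev.b;
                      if (dr < -2 || dr > 1 || dg < -2 || dg > 1 || db < -2 || db > 1)
                          return 0; /* fall back to 2-byte QOI_OP_LUMA or 4-byte QOI_OP_RGB */
                      *out = 0x40 | (dr + 2) << 4 | (dg + 2) << 2 | (db + 2);
                      return 1;
                  }

                farbfeld, by contrast, always spends 8 bytes per pixel (16-bit big-endian RGBA) and leaves all compression to an external tool such as bzip2.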