This is a nit not actually related to the content (which was very interesting!) but… I wish this author did not use “obviously” so much. “Everybody knows that JPEGS are 8x8…” actually, no. Even among programmers that’s relatively specialized knowledge, not to mention non-technical folks.
Again, this was actually very interesting, and I’m glad to have learned something! But it made me feel ignorant instead of excited to learn, and that’s a bummer in my book.
These tweets would be far more interesting to people with passing knowledge of image formats, such as myself, if they included the “obvious” things that “everyone” already knows. Because I don’t!
Hah, I thought you were going to link to Ten Thousand. But that one’s good too :P
Interesting, the same line made me go “huh, that makes some sense” and I moved on.
Good observation, but the “vertical slices” mentioned in the tweet aren’t a thing.
JPEG simply has no features for dealing with repetitive patterns. 8x8 blocks are coded in a largely independent way. They encode overall brightness (the DC coefficient) as a difference from their neighbor on the left, and they all share the same Huffman table, which may improve the average compression ratio if the pattern is always neatly aligned to block boundaries. But apart from that, each 8x8 block is almost like a separate image.
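To make the "difference from the neighbor on the left" part concrete, here is a minimal Python sketch of that kind of difference coding. The function names and the sample values are made up for illustration, and it leaves out everything else a real encoder does (quantization, Huffman coding, separate predictors per color component).

```python
def encode_dc(dc_values):
    """Replace each block's DC value with its difference from the previous block."""
    diffs = []
    previous = 0  # assumption for this sketch: the predictor starts at zero
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def decode_dc(diffs):
    """Undo encode_dc by keeping a running sum of the differences."""
    values = []
    previous = 0
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


# Hypothetical DC values for a row of blocks with similar brightness.
row = [120, 122, 121, 121, 90]
diffs = encode_dc(row)
print(diffs)
```

Neighboring blocks with similar brightness turn into small numbers, which is about the only information one block borrows from another.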
PNG wins here because DEFLATE can simply say “copy pixels from over there”. That’s an obvious win on synthetic graphics, and conversely it’s almost never applicable in imperfect, noisy photos of the real world.
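A rough way to see the "copy pixels from over there" effect is to run raw bytes through Python's `zlib` module, which implements DEFLATE. This is only a sketch: it is not a real PNG (PNG also filters each row before compressing), and the 64-byte tile and the seeded random bytes are stand-ins I made up for synthetic graphics and noisy photo data.

```python
import random
import zlib

# A hypothetical 64-byte "tile" repeated many times, standing in for synthetic graphics.
tile = bytes(range(64))
pattern = tile * 1024

# Seeded pseudo-random bytes of the same length, standing in for noisy photo data.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(pattern)))

pattern_size = len(zlib.compress(pattern, 9))
noise_size = len(zlib.compress(noise, 9))

# Original length, compressed repetitive data, compressed noise.
print(len(pattern), pattern_size, noise_size)
```

The repeated tile should shrink to a tiny fraction of its original size, while the noise stays at roughly its original length because there is nothing earlier in the stream worth copying.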
Right, it’s more about the content. JPEG handles noisy pictures well, and PNG is better at lossless screenshots.
Tweet author here. I didn’t want to dive into too many technical details here (mostly because it was a relatively high-level observation), but the vertical slices I’m referring to are the FF Dx chunks, each representing a completely independent image, usually 8 or 16 pixels high. You can experiment with it by modifying such a chunk (or even deleting or truncating it!) in a JPEG and seeing how the corruption only affects a single 8 or 16 pixels tall slice. (This also means the Huffman table “resets” per chunk/slice.) This led me to realize that no matter how good in-chunk coding would be, it simply can’t make use of any shared features between two different chunks, which further led me to realize how easy it is to craft an extremely compact PNG that JPEG simply can’t reduce in size.
The end conclusion was also not “PNGs are better than JPEGs”, but rather “misusing JPEGs for non-photographic content is even worse than one would usually think”.
The 16px you see is from chroma subsampling. It’s still an 8x8 block, but pixels of chroma channels are upscaled when decoding.
There are no vertical slices. Blocks are arranged left-to-right (horizontally) wrapping around in lines (in progressive JPEG there are layers, but blocks within a layer are still left-to-right lines).
If you see a vertical slice, it’s only because mangling the compressed data makes the decoder either skip some blocks or decode some blocks that shouldn’t be there, and this shifts all remaining blocks in the image left or right (like adding or removing words in word-wrapped text). Typically this makes the edge of the image wrap around and become visible.
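The word-wrap analogy can be sketched in a few lines of Python. The `block_origin` helper is hypothetical, and the block size is a parameter because it is 8 pixels without chroma subsampling and can be 16 with it; the 32-pixel image width is just an example value.

```python
import math


def block_origin(index, image_width, block_size=8):
    """Return the (x, y) pixel position of the top-left corner of block number `index`."""
    blocks_per_row = math.ceil(image_width / block_size)
    row, col = divmod(index, blocks_per_row)
    return col * block_size, row * block_size


# A hypothetical 32-pixel-wide image has 4 blocks per row.
intact = [block_origin(i, 32) for i in range(6)]

# If the decoder loses block 0, blocks 1..5 each land one slot early.
after_loss = [block_origin(i - 1, 32) for i in range(1, 6)]

print(intact)
print(after_loss)
```

In the intact list, block 4 starts the second row at (0, 8). After one block is lost, that same block lands at (24, 0), the end of the first row, which is how the left edge of the picture ends up visible somewhere in the middle.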
FF Dx is a restart marker. It’s a data-stream-only feature, with no visual meaning. It’s an optional mechanism for limiting the effects of damage in the file. Loss of a single bit in Huffman-coded data makes the entire rest of the stream garbage, unless you have a marker that lets you restart from a known state. Compressed data that would otherwise accidentally contain a pattern that looks like a marker is escaped: every FF byte in it is followed by a 00 byte.
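If you want to look for these markers yourself, a minimal Python sketch could look like the one below. The function name is made up, and it assumes you hand it only the compressed scan data: elsewhere in a JPEG file (metadata, embedded thumbnails) the same byte pairs can mean something else. An FF byte followed by 00 is an escaped data byte, not a marker, so it never matches.

```python
def find_restart_markers(scan_data):
    """Return (offset, n) pairs for every restart marker FF D0 .. FF D7 in scan_data."""
    found = []
    i = 0
    while i < len(scan_data) - 1:
        if scan_data[i] == 0xFF and 0xD0 <= scan_data[i + 1] <= 0xD7:
            found.append((i, scan_data[i + 1] - 0xD0))
            i += 2  # skip past the two-byte marker
        else:
            i += 1
    return found


# Hypothetical bytes: one escaped FF 00 pair, then two restart markers.
sample = bytes([0x12, 0xFF, 0x00, 0x34, 0xFF, 0xD0, 0x56, 0xFF, 0xD1])
markers = find_restart_markers(sample)
print(markers)
```

Splitting the data at those offsets gives you the pieces that can each be decoded from a known state, which is all the marker is for.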
Because PNG is usually bigger for photos?