Mitigation not listed: use a base64 decoder that errors if the padding bits are not all zero, instead of ignoring them. Then there will be only one valid encoding for a given payload.
By odd coincidence I just ran into this on Thursday, while implementing d64, a base64 variant with a different alphabet.
The README says:
For characters which overhang the byte array (i.e. the last character if length % 4 is 2 or 3) the overhanging portion must encode off bits.
After I figured out what this meant, I tweaked my C++ decoder to reject the input if the “carry” bits end up nonzero at the end. I’m glad I did this because it mitigates the issue you describe.
Oddly, the reference JS implementation of d64 in the repo doesn’t perform this check.
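For anyone who wants to try the check described above, here is a minimal Python sketch. It is not the C++ decoder mentioned here: it assumes standard padded base64 rather than the d64 alphabet, and `decode_canonical` is a made-up name. Instead of tracking carry bits, it re-encodes the decoded bytes and compares the result with the input. An encoder always writes zeros into the leftover bits, so this also catches nonzero ones.

```python
import base64
import binascii


def decode_canonical(text: str) -> bytes:
    """Decode padded standard base64, rejecting non-canonical spellings."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        raw = base64.b64decode(text, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"invalid base64: {exc}") from exc
    # Re-encoding zeroes the leftover bits, so a mismatch means the input
    # was a different spelling of the same payload.
    if base64.b64encode(raw).decode("ascii") != text:
        raise ValueError("non-canonical base64: leftover bits are not zero")
    return raw
```

With this, `"QQ=="` decodes to `b"A"`, while `"QR=="` (same payload bits, different leftover bits) is rejected.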
That is a surprise. I’ve only used base64 as the final step in encoding data for transport over protocols or formats that don’t support binary data. And there are a lot of them… XML being fairly notable for not being able to encode NUL.
XML can encode NUL; different protocols may do it differently, though. There is no one “XML encoding”.
I can encode NUL as base64-text-rasterized-in-a-png, too.
There’s a difference between “you can build an encoding scheme in this language which can represent NUL” and “the language has explicit support for this feature, and it works well”.
If anyone wants to do more research, requiring the padding bits to be zero is called “canonical base64”.
Great suggestion! I’ll add it.
Does XML accept a numeric character reference for NUL, like this:
&#0;
?
https://en.m.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
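One quick way to check is to hand such a document to a parser. Below is a small sketch using Python's standard library `xml.etree.ElementTree`; the helper name `parses` is made up for this example. XML 1.0 excludes U+0000 from its allowed characters, so a conforming parser should reject the NUL reference while still accepting a reference to an allowed control character such as tab.

```python
import xml.etree.ElementTree as ET


def parses(doc: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(doc)
    except ET.ParseError:
        return False
    return True


# Numeric character reference for NUL (U+0000).
nul_ref_ok = parses("<a>&#0;</a>")
# Numeric character reference for tab (U+0009), which XML 1.0 allows.
tab_ref_ok = parses("<a>&#9;</a>")

print("NUL reference accepted:", nul_ref_ok)
print("tab reference accepted:", tab_ref_ok)
```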
I usually generate my secrets as multiples of 3 octets when they are to be base64-encoded to avoid this and to avoid wasting space on the padding.
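A minimal sketch of that habit in Python, assuming the standard library `secrets` module as the random source. The function name `make_secret` and the default of 8 groups are arbitrary choices for this example. Every 3 octets map to exactly 4 base64 characters, so there are no leftover bits and no `=` padding.

```python
import base64
import secrets


def make_secret(groups: int = 8) -> str:
    """Return a base64 secret built from 3 * groups random octets."""
    raw = secrets.token_bytes(3 * groups)
    return base64.b64encode(raw).decode("ascii")


token = make_secret()
print(len(token), token)
```

With the default of 8 groups, that is 24 random octets encoded as 32 characters.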
Another fun surprise is that base64 does not preserve sort order. E.g. 0x000033 < 0x000034 but AAAz > AAA0
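This is easy to reproduce with Python's standard `base64` module. The alphabet puts letters before digits, while ASCII puts digits before letters, which is where the ordering flips:

```python
import base64

a = bytes.fromhex("000033")
b = bytes.fromhex("000034")

enc_a = base64.b64encode(a).decode("ascii")  # "AAAz"
enc_b = base64.b64encode(b).decode("ascii")  # "AAA0"

# The raw bytes sort one way...
print(a < b)
# ...and the encoded strings sort the other way, because "z" > "0" in ASCII.
print(enc_a > enc_b)
```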
Seems like a fun way to do some steganography - I hope some CTF makers are paying attention.
Oooh! Hiding bits among the padding across a few hundred messages, repeating as they get scraped from the server.
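A rough Python sketch of how that could work for a single message, assuming standard padded base64. The names `hide` and `reveal` are invented for this example. A payload whose length is 1 mod 3 leaves 4 spare bits in its last character, and one whose length is 2 mod 3 leaves 2.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def hide(payload: bytes, hidden: int) -> str:
    """Encode payload, storing `hidden` in the unused bits of the last character."""
    spare = {0: 0, 1: 4, 2: 2}[len(payload) % 3]
    if spare == 0 or not 0 <= hidden < (1 << spare):
        raise ValueError("hidden value does not fit in the spare bits")
    text = base64.b64encode(payload).decode("ascii")
    body = text.rstrip("=")
    pad = text[len(body):]
    # The encoder leaves the spare bits at zero, so OR-ing is enough.
    last = ALPHABET.index(body[-1]) | hidden
    return body[:-1] + ALPHABET[last] + pad


def reveal(text: str) -> int:
    """Read back the spare bits of the last non-padding character."""
    body = text.rstrip("=")
    spare = {0: 0, 1: 2, 2: 4}[len(text) - len(body)]
    return ALPHABET.index(body[-1]) & ((1 << spare) - 1)
```

A decoder that ignores the leftover bits would still return the original payload, which is what keeps the channel hidden. A canonical decoder would reject the message.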
What a fun suggestion!