1. 40
  1. 18

    Mitigation not listed: use a base64 decoder that errors if the padding bits are not all zero, instead of ignoring them. Then there will be only one valid encoding for a given payload.
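
    A minimal sketch of such a check in Python, assuming standard padded base64. The helper name `b64decode_canonical` is made up for illustration. Rather than inspecting the padding bits directly, it re-encodes the decoded bytes and compares, which rejects any non-canonical spelling:

    ```python
    import base64


    def b64decode_canonical(text: str) -> bytes:
        """Decode padded standard base64, rejecting non-canonical input.

        Hypothetical helper: the name and the error message are illustrative.
        """
        # validate=True rejects characters outside the base64 alphabet.
        data = base64.b64decode(text, validate=True)
        # Re-encoding always produces zeroed padding bits, so a mismatch
        # means the input was not the canonical encoding of `data`.
        if base64.b64encode(data).decode("ascii") != text:
            raise ValueError("non-canonical base64 input")
        return data
    ```

    With this, `QQ==` decodes to the single byte `A`, while `QR==` (the same byte with stray padding bits set) raises `ValueError`.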

    1. 5

      “Then there will be only one valid encoding for a given payload.”

      If anyone wants to do more research, this is called “canonical base64”.

      1. 4

        Great suggestion! I’ll add it.

      2. 4

        By odd coincidence I just ran into this on Thursday, while implementing d64, a base64 variant with a different alphabet.

        The README says:

        “For characters which overhang the byte array (i.e. the last character if length % 4 is 2 or 3) the overhanging portion must encode off bits.”

        After I figured out what this meant, I tweaked my C++ decoder to reject the input if the “carry” bits end up nonzero at the end. I’m glad I did this because it mitigates the issue you describe.

        Oddly, the reference JS implementation of d64 in the repo doesn’t perform this check.
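
        For illustration, a rough Python sketch of that carry-bit check (not the C++ decoder mentioned above, and not d64 itself). It assumes unpadded input and uses the standard base64 alphabet as a stand-in; the function name and the `alphabet` parameter are hypothetical:

        ```python
        STD_ALPHABET = (
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/"
        )


        def decode_unpadded_strict(text: str, alphabet: str = STD_ALPHABET) -> bytes:
            """Decode unpadded base64-style text, rejecting nonzero overhang bits."""
            if len(text) % 4 == 1:
                raise ValueError("impossible input length")
            index = {ch: i for i, ch in enumerate(alphabet)}
            out = bytearray()
            carry = 0       # bits read so far that have not been emitted as a byte
            carry_bits = 0  # number of meaningful bits currently held in `carry`
            for ch in text:
                if ch not in index:
                    raise ValueError(f"invalid character: {ch!r}")
                carry = (carry << 6) | index[ch]
                carry_bits += 6
                if carry_bits >= 8:
                    carry_bits -= 8
                    out.append(carry >> carry_bits)
                    carry &= (1 << carry_bits) - 1
            # Whatever is left overhangs the byte array and must be all zero.
            if carry != 0:
                raise ValueError("overhanging bits are not zero")
            return bytes(out)
        ```

        Here `QQ` decodes to `A`, while `QR` is rejected because its last character leaves a nonzero carry.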

        1. 3

          That is a surprise. I’ve only used base64 as the final step in encoding data for transport over protocols or formats that don’t support binary data. And there are a lot of them… XML being fairly notable for not being able to encode NUL.

          1. 1

            XML can encode NUL, different protocols may do it differently though. There is no one “XML encoding”.

            1. 2

              “XML can encode NUL, different protocols may do it differently though. There is no one ‘XML encoding’.”

              I can encode NUL as base64-text-rasterized-in-a-png, too.

              There’s a difference between “you can build an encoding scheme in this language which can represent NUL” and “the language has explicit support for this feature, and it works well”.

              1. 1

                Does XML accept a numeric character reference for NUL, like this: &#0; ?

                https://en.m.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
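
                One quick way to check, as a sketch: XML 1.0 excludes U+0000 from its allowed characters, and a character reference has to name an allowed character, so a conforming parser should refuse it. The snippet below tries this with Python's standard-library parser; the helper name `parses` is made up for illustration:

                ```python
                import xml.etree.ElementTree as ET


                def parses(document: str) -> bool:
                    """Return True if the standard-library parser accepts the document."""
                    try:
                        ET.fromstring(document)
                    except ET.ParseError:
                        return False
                    return True


                print(parses("<a>&#65;</a>"))  # ordinary character reference
                print(parses("<a>&#0;</a>"))   # character reference for NUL
                ```

                The first document is accepted and the second is rejected as a reference to an invalid character.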

            2. 1

              I usually generate my secrets as multiples of 3 octets when they are to be base64-encoded to avoid this and to avoid wasting space on the padding.

              Another fun surprise is that base64 does not preserve sort order. E.g. 0x000033 < 0x000034, but their encodings sort the other way in ASCII order: AAAz > AAA0.
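
              A small Python check of that example, as a sketch. The cause is that the standard alphabet puts letters before digits, the reverse of ASCII order; the helper name `b64` is just for illustration:

              ```python
              import base64


              def b64(data: bytes) -> str:
                  """Standard base64 encoding as text."""
                  return base64.b64encode(data).decode("ascii")


              low = bytes.fromhex("000033")
              high = bytes.fromhex("000034")

              print(low < high)            # True: the raw bytes sort as expected
              print(b64(low), b64(high))   # AAAz AAA0
              print(b64(low) < b64(high))  # False: 'z' sorts after '0' in ASCII
              ```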

              1. 1

                Seems like a fun way to do some steganography - I hope some CTF makers are paying attention.

                1. 2

                  Oooh! Hiding bits among the padding across a few hundred messages, repeating as they get scraped from the server.

                  What a fun suggestion!
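
                  A rough sketch of how that could work in Python. The names `hide` and `reveal` are hypothetical, and it assumes the receiver uses a lenient decoder that ignores the spare bits, as Python's `base64.b64decode` does. Each padded message can carry 2 bits (one `=`) or 4 bits (two `=`):

                  ```python
                  import base64

                  ALPHABET = (
                      "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                      "abcdefghijklmnopqrstuvwxyz"
                      "0123456789+/"
                  )


                  def _pad_count(encoded: str) -> int:
                      """Number of trailing '=' characters."""
                      return len(encoded) - len(encoded.rstrip("="))


                  def hide(payload: bytes, secret: int) -> str:
                      """Encode payload, stashing `secret` in the unused padding bits."""
                      encoded = base64.b64encode(payload).decode("ascii")
                      pad = _pad_count(encoded)
                      spare = 2 * pad  # one '=' leaves 2 spare bits, two leave 4
                      if secret < 0 or secret >> spare:
                          raise ValueError("secret does not fit in the spare bits")
                      if spare == 0:
                          return encoded
                      pos = len(encoded) - pad - 1  # last data character
                      value = ALPHABET.index(encoded[pos]) | secret
                      return encoded[:pos] + ALPHABET[value] + encoded[pos + 1:]


                  def reveal(encoded: str) -> int:
                      """Read back the bits stored in the padding bits."""
                      pad = _pad_count(encoded)
                      if pad == 0:
                          return 0
                      pos = len(encoded) - pad - 1
                      return ALPHABET.index(encoded[pos]) & ((1 << (2 * pad)) - 1)
                  ```

                  For example, `hide(b"A", 0b0101)` gives `QV==`, which a lenient decoder still reads as `A`, while `reveal("QV==")` returns 5. A canonical decoder would reject the message, which is also how this trick gets detected.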