1. 6

  2. 4

    What I find interesting here is that it’s an instance of a pretty classic security vuln: one function calculates how much memory is needed using one formula, then another function uses that memory with a different formula.

    Many years ago I found the exact same bug in Mono. When converting invalid UTF-8 to Latin-1, the length() function skipped invalid characters, but the encode() function output ‘?’ for each one, so the output could be longer than the buffer sized for it. (Roughly; I recall there was some other wrinkle.)

    Some years later, when I wrote my own UTF-8 conversion functions, I remembered this lesson and had the length function decode each character, discarding the result, thus guaranteeing it would never interpret a string differently than the conversion code, even though this was ever so slightly slower.
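A minimal sketch of that idea, in Python rather than whatever the original code was written in. The names `decode_one`, `latin1_length` and `to_latin1` are made up for illustration, and the decoder is deliberately simplified (it does not reject overlong forms or surrogates). The point is only that the length pass and the conversion pass both go through the same `decode_one`, so they cannot disagree about how many output bytes a given input produces.

```python
from typing import Optional, Tuple

REPLACEMENT = ord("?")


def decode_one(buf: bytes, i: int) -> Tuple[Optional[int], int]:
    """Decode one UTF-8 sequence starting at buf[i].

    Returns (code_point, next_index). On malformed input returns
    (None, i + 1), consuming exactly one byte. Simplified on purpose:
    overlong encodings and surrogates are not rejected.
    """
    b0 = buf[i]
    if b0 < 0x80:
        return b0, i + 1
    if 0xC0 <= b0 < 0xE0:
        need, cp = 1, b0 & 0x1F
    elif 0xE0 <= b0 < 0xF0:
        need, cp = 2, b0 & 0x0F
    elif 0xF0 <= b0 < 0xF8:
        need, cp = 3, b0 & 0x07
    else:
        # Stray continuation byte or invalid lead byte.
        return None, i + 1
    for k in range(1, need + 1):
        # Truncated sequence or a byte that is not a continuation byte.
        if i + k >= len(buf) or (buf[i + k] & 0xC0) != 0x80:
            return None, i + 1
        cp = (cp << 6) | (buf[i + k] & 0x3F)
    return cp, i + need + 1


def latin1_length(buf: bytes) -> int:
    """Count output bytes by running the real decoder and discarding the result."""
    n = 0
    i = 0
    while i < len(buf):
        _, i = decode_one(buf, i)
        n += 1  # every decode step emits exactly one output byte
    return n


def to_latin1(buf: bytes) -> bytes:
    """Convert UTF-8 to Latin-1, writing '?' for anything unrepresentable."""
    out = bytearray(latin1_length(buf))
    i = 0
    pos = 0
    while i < len(buf):
        cp, i = decode_one(buf, i)
        out[pos] = cp if cp is not None and cp <= 0xFF else REPLACEMENT
        pos += 1
    assert pos == len(out)  # length pass and conversion pass agree
    return bytes(out)
```

The cost is that the input gets decoded twice, which is the "ever so slightly slower" part. In a language with manual memory management the same structure applies, with `out` being the buffer allocated from the length function's result.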