
    This is awesome news! UTF-8 should be used everywhere, as it has essentially no disadvantages compared to UTF-16 and UTF-32. The counterargument about the “library full of Chinese text”, where UTF-16 is favored because it needs 16 instead of 24 bits for many CJK characters, falls apart when you consider that such a library would use some kind of markup language, which most likely has ASCII identifiers and symbols.

    What kills UTF-16 for me is that it’s endian-dependent. Let’s move on and tackle the higher-level problems of Unicode, namely normalization and the handling of grapheme clusters (which can consist of multiple code points).
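
For anyone who wants to poke at these claims, here is a rough Python sketch. The sample sentence, the HTML-like wrapper and the `sizes` helper are made up for illustration, so treat the numbers as a toy comparison and not as a benchmark of real documents.

```python
import unicodedata

# Hypothetical sample: a short Chinese sentence, bare and wrapped in ASCII markup.
text = "统一码很好"
marked_up = f'<p class="quote" lang="zh">{text}</p>'


def sizes(s):
    """Return (UTF-8 byte count, UTF-16 byte count without BOM) for s."""
    return len(s.encode("utf-8")), len(s.encode("utf-16-le"))


print("bare text :", sizes(text))       # UTF-16 is smaller here
print("with tags :", sizes(marked_up))  # the ASCII markup tips it towards UTF-8

# Endianness: UTF-16 has two byte orders, UTF-8 has exactly one byte sequence.
print("é".encode("utf-16-le"), "é".encode("utf-16-be"), "é".encode("utf-8"))

# Normalization: the same user-perceived character, spelled two ways.
composed = "\u00e9"     # é as a single code point
decomposed = "e\u0301"  # e followed by a combining acute accent
print(composed == decomposed)                                # different code points
print(unicodedata.normalize("NFC", decomposed) == composed)  # equal after NFC
print(len(decomposed))  # one grapheme cluster, but two code points
```

As far as I know, Python's standard library does not segment grapheme clusters, so counting user-perceived characters needs extra code or a third-party package regardless of which encoding is used.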