1. 13
  1.  

  2. 3

    I’m not sure I have the best feelings about where this article is going so far. The beginning seems to mix up UTF-32 and UTF-8 in a bunch of places. Also the claim that Linux is fully UCS-4 is false, as is Linux not being locale-dependent.

    Perhaps the author has their own mental model of how Unicode, transformation formats and wide characters work, but the explanation here doesn’t give me a ton of confidence.

    That said, I totally agree with the advice so far about creating a width barrier at the edge of your app, and ensuring that you are consistent internally. This makes it easier to port code to systems like Windows.
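
    A minimal sketch of what such a width barrier could look like in C, assuming the app takes UTF-8 at its edge and uses 32-bit code points internally. The helper name `utf8_to_utf32` is hypothetical, not something from the article. It avoids `wchar_t` entirely, so the internal width stays the same on Windows, where `wchar_t` is 16 bits.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical edge-of-app helper: decode UTF-8 bytes into 32-bit
       code points. Returns the number of code points written, or
       (size_t)-1 on malformed input or when `out` is too small. */
    size_t utf8_to_utf32(const unsigned char *in, size_t in_len,
                         uint32_t *out, size_t out_cap)
    {
        /* Smallest code point allowed for each sequence length;
           anything below it is an overlong encoding. */
        static const uint32_t min_cp[4] = {0x0, 0x80, 0x800, 0x10000};
        size_t i = 0, n = 0;

        while (i < in_len) {
            unsigned char b = in[i];
            uint32_t cp;
            size_t extra; /* number of continuation bytes expected */

            if (b < 0x80)                { cp = b;        extra = 0; }
            else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
            else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
            else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
            else return (size_t)-1;      /* invalid lead byte */

            if (extra > in_len - i - 1)
                return (size_t)-1;       /* truncated sequence */

            for (size_t k = 1; k <= extra; k++) {
                unsigned char c = in[i + k];
                if ((c & 0xC0) != 0x80)
                    return (size_t)-1;   /* not a continuation byte */
                cp = (cp << 6) | (c & 0x3F);
            }

            /* Reject overlong forms, surrogates, and out-of-range values. */
            if (cp < min_cp[extra] || cp > 0x10FFFF ||
                (cp >= 0xD800 && cp <= 0xDFFF))
                return (size_t)-1;

            if (n == out_cap)
                return (size_t)-1;       /* output buffer full */
            out[n++] = cp;
            i += extra + 1;
        }
        return n;
    }
    ```

    Everything past this call works on `uint32_t` code points only; the reverse conversion would sit at the output edge.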

    1. 1

      > Also the claim that Linux is fully UCS-4 is false, as is Linux not being locale-dependent.

      If I write GNU/Linux, will it make you feel better? It is a glibc fact, and AFAIK other C libraries on Linux share it: wide characters are locale-independent (though perhaps Han unification still makes it lossy).

      I’m not aware of mix-ups, you’d have to point them out.
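
      For anyone who wants to check the wide-character claim on their own system, here is a small sketch. Standard C says that when `__STDC_ISO_10646__` is defined, `wchar_t` values are ISO 10646 code points; the two function names below are hypothetical and only wrap that macro and `WCHAR_MAX`. What they return depends on the platform and C library.

      ```c
      #include <wchar.h>

      /* Returns 1 if the implementation defines __STDC_ISO_10646__,
         i.e. it states that wchar_t values are ISO 10646 code points;
         returns 0 otherwise. */
      int wchar_is_iso10646(void)
      {
      #ifdef __STDC_ISO_10646__
          return 1;
      #else
          return 0;
      #endif
      }

      /* Returns 1 if wchar_t is wide enough to hold every Unicode
         code point (up to U+10FFFF) in a single unit. */
      int wchar_holds_any_code_point(void)
      {
          return WCHAR_MAX >= 0x10FFFF;
      }
      ```

      If the first function returns 0, or the second does on a 16-bit `wchar_t`, then the locale-independence argument does not carry over to that platform.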