1. 10
  1.  

  2. 4

    This is the first article that has convinced me to use Python 3 by default. Thanks for spending the time to dig into this weird bug!

    1. 2

      Tricky bug, but I don’t understand this part:

      “I’m pretty sure you’ve all made this type of transformation once when receiving JSON, haven’t you?”

      I can’t think of a reason to read JSON and convert the string data to latin-1, except for interacting with some old interface that explicitly requires it.

      Was the author being facetious, or is there really some good reason to do this that I’m missing?

      1. 3

        The comment in the code snippet says “Convert unicode strings into latin1 strings because of WSGI”. WSGI is Python’s Web Server Gateway Interface, a standard API between Python web apps and the HTTP servers that host them.

        Quoting from the WSGI specification (PEP 3333):

        In general, HTTP deals with bytes, which means that this specification is mostly about handling bytes.

        However, the content of those bytes often has some kind of textual interpretation, and in Python, strings are the most convenient way to handle text.

        […]

        Do not be confused however: even if Python’s str type is actually Unicode “under the hood”, the content of native strings must still be translatable to bytes via the Latin-1 encoding!

        Exactly why you’d want to do that on received JSON, I don’t know, but there’s definitely some link between WSGI and Latin-1.

      2. 2

        I feel like spawning threads on module import is probably a bad idea.

        You can do similarly interesting things to the JVM (or you used to be able to; my knowledge is very out of date and this may have changed). Class loading takes a lock, so if you do fun things like spawning a thread in a static initializer, you can very easily get your application to deadlock.

        ETA: This is still an interesting issue, mind you! And of course you can still hit the problem without spawning threads on import if your threaded code is importing things.
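
To make the WSGI quote earlier in the thread a bit more concrete, here is a minimal Python 3 sketch of what "translatable to bytes via the Latin-1 encoding" means for a native string. The function names are made up for illustration, and using UTF-8 as the underlying byte encoding is an assumption; the spec only requires that every character fits in Latin-1.

```python
def is_wsgi_safe(value: str) -> bool:
    """Return True if the string can be turned into bytes via Latin-1."""
    try:
        value.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


def to_wsgi_native(text: str) -> str:
    """Carry arbitrary text in a Latin-1-safe native string (hypothetical helper).

    The text is encoded to UTF-8 bytes (an assumption), and decoding those
    bytes as Latin-1 maps each byte to the code point with the same value,
    so every character in the result is <= U+00FF.
    """
    return text.encode("utf-8").decode("latin-1")


def from_wsgi_native(native: str) -> str:
    """Undo to_wsgi_native: recover the bytes, then decode them as UTF-8."""
    return native.encode("latin-1").decode("utf-8")
```

With these, `is_wsgi_safe("€uro")` is false, but `to_wsgi_native("€uro")` yields a string that passes the check and round-trips through `from_wsgi_native`. The intermediate string looks like mojibake, which is probably part of why this kind of conversion surprises people.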
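
On the point about spawning threads on module import: one way to sidestep it is to keep import side-effect free and start the thread from an explicit call. This is only a sketch under that assumption; `start_worker` and `_run` are hypothetical names, not anything from the article.

```python
import threading

_worker = None
_worker_lock = threading.Lock()


def _run():
    # Importing inside the thread is fine as long as start_worker() is only
    # called after this module has finished importing (an assumption here).
    import json
    json.dumps({"status": "ok"})


def start_worker():
    """Start the background thread on first call; return the same thread afterwards."""
    global _worker
    with _worker_lock:
        if _worker is None:
            _worker = threading.Thread(target=_run, daemon=True)
            _worker.start()
        return _worker
```

The application would call `start_worker()` from its entry point rather than relying on `import` to kick things off. As the ETA above notes, this does not help if threaded code elsewhere performs imports while another import is still in progress.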