1. 16

  2. 8

    I know. I think my comment in an email the other day was: “daemonization is an anti-pattern, and that’s a hill I’m willing to die on.” I’m firmly in the service supervision camp.

    The program in question is the way it is for historical reasons. It has been around a long time, and it hasn’t been touched in some while. I just wanted to make a one-line change and get on with my day. I didn’t expect it to lead to two hours of frustration.

    1. 2

      Does this have anything to do with daemons? Wouldn’t the exact same (non-forked) program running behind a supervisor have the exact same behaviour? The real problem here is resource state tracking. Dying on the no-daemon hill will not solve it. Linear types could, for example.

    2. 3

      The fix was committed upstream in May of 2013, but it has not yet made its way to CentOS 7.


      1. 1

        yep, that’s the real kicker in this story.

      2. 2

        Services should be designed to run in the foreground and let the service manager manage them, instead of trying to daemonise themselves.

        If you have to deploy to a production environment without a proper service manager, this is good to know, though!

        1. 7

          That depends on the service manager.

          This program has the same problem:

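          import os
          import socket

          s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
          os.close(s.fileno())      # close the fd behind the socket object's back
          r, w = os.pipe()          # the pipe's read end may reuse the freed fd number
          os.write(w, b'hi\n')
          s = None                  # dropping the socket closes "its" fd, now the pipe's read end
          print(os.read(r, 4096))   # can fail with EBADF (bad file descriptor)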

          Is that example more clear?

          1. 3

            (Scott Pilgrim voice) wow, that is eeeevil…

            Also, yes, IMO this example is clearer. I’ve been bitten by the Python daemon module before, but the interaction between FD numbers and the GC is subtle.

            1. 2

              Thanks, added.

              1. 1

                I’m becoming less and less convinced GC is a good way to handle resources other than memory. It just seems to do a suboptimal thing at the wrong time (or maybe never).

                1. 4

                  Replacing os.close(s.fileno()) with s.close() in the above example also fixes the bug.

                  One could argue it should not be possible to fetch the file descriptor out of the socket if the socket is going to close the file descriptor itself when it’s garbage collected. Or maybe a callback has to be registered to handle when the file descriptor is externally closed.
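
For anyone who wants to try the two safe variants discussed above, here is a minimal sketch. The first function uses `s.close()` as suggested; the second uses `socket.detach()`, which Python's standard library provides for handing the descriptor over to the caller so that the socket object no longer closes it. The function names `read_after_close` and `read_after_detach` are made up for illustration, and the code assumes a Unix-like system where `AF_UNIX` is available.

```python
import os
import socket


def read_after_close() -> bytes:
    """Let the socket close its own descriptor, then reuse the fd number."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.close()              # the socket object now knows its fd is gone
    r, w = os.pipe()       # may reuse the fd number that was just freed
    try:
        os.write(w, b"hi\n")
        s = None           # dropping the object no longer closes any fd
        return os.read(r, 4096)
    finally:
        os.close(r)
        os.close(w)


def read_after_detach() -> bytes:
    """Take ownership of the fd with detach() before closing it by hand."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    fd = s.detach()        # the socket forgets the fd without closing it
    os.close(fd)           # closing it manually is now safe
    r, w = os.pipe()
    try:
        os.write(w, b"hi\n")
        s = None           # nothing left for the socket object to close
        return os.read(r, 4096)
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(read_after_close())
    print(read_after_detach())
```

In both variants exactly one party owns the descriptor at any time, so the garbage collector never closes an fd number that has since been handed to the pipe.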