1. 28
  1. 21

    Indeed, pthread implementations on other systems seem to be rather forgiving in comparison to OpenBSD’s. I would definitely recommend testing threaded programs on OpenBSD. Any unexpected behaviour is worth investigating. It might uncover bugs in the program.

    For example, I have fixed a bug in pulseaudio with help from Philip Guenther, which was caught by a different safeguard: A thread ID is only valid in the current process, and becomes invalid after fork().

    On OpenBSD, re-use of a thread ID after fork() caused ‘pulseaudio --start’ to hang forever, thus blocking GNOME’s GDM login prompt whenever I tried to log into GNOME again after logging out, which is how I noticed…

    https://cgit.freedesktop.org/pulseaudio/pulseaudio/commit/?id=d8e2b3a78c6c60ffec9a115a0e35a3cb4c68598f

    https://bugs.freedesktop.org/show_bug.cgi?id=71738
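
    To make the thread ID point concrete, here is a minimal C sketch. It is not the pulseaudio fix linked above, only one way a program could guard against the mistake. The names `thread_ref`, `thread_ref_start`, `thread_ref_usable`, `thread_ref_join` and `thread_ref_usable_in_child` are hypothetical. The idea is to remember which process created a thread and refuse to use its ID anywhere else.

    ```c
    #include <assert.h>
    #include <pthread.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Hypothetical wrapper: a thread ID plus the process it was created in. */
    struct thread_ref {
        pid_t owner;
        pthread_t tid;
    };

    static void *worker(void *arg) { return arg; }

    /* Start a thread and record which process owns the resulting ID. */
    static int thread_ref_start(struct thread_ref *ref)
    {
        assert(ref != NULL);
        ref->owner = getpid();
        return pthread_create(&ref->tid, NULL, worker, NULL);
    }

    /* The ID is only meaningful in the process that created the thread. */
    static int thread_ref_usable(const struct thread_ref *ref)
    {
        return ref->owner == getpid();
    }

    /* Join only if the ID belongs to this process; otherwise refuse. */
    static int thread_ref_join(struct thread_ref *ref)
    {
        if (!thread_ref_usable(ref))
            return -1;
        return pthread_join(ref->tid, NULL);
    }

    /* Fork and report whether the child considers the ID usable.
     * The child only calls getpid() and _exit(), both async-signal-safe. */
    static int thread_ref_usable_in_child(const struct thread_ref *ref)
    {
        int status;
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(thread_ref_usable(ref) ? 1 : 0);
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
    ```

    In the parent, `thread_ref_usable()` returns 1 and `thread_ref_join()` works as usual. In a forked child the stored process ID no longer matches, so the stale thread ID is never handed to `pthread_join()`.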

    1. 8

      Coincidentally, I just ran Apache APR’s test suite, and got this :)

      Failed Tests            Total   Fail    Failed %
      ===================================================
      testpass                    5      2     40.00%
      pthread_mutex_destroy on mutex with waiters!
      pthread_mutex_destroy on mutex with waiters!
      *** Error 1 in test (Makefile:212 'check')
      *** Error 1 in /home/stsp/src/apr-trunk (Makefile:159 'check')
      

      (edit: typo)

    2. 3

      Pretty cool, but the error message could be worded a little more clearly for the application developer.

      1. 3

        This may be due to my familiarity with pthreads and C++ mutexes, but this error honestly was lucid for me. Could you give me an idea what you wish the error would say?

        1. 7

          pthread_mutex_destroy on mutex with waiters!

          Random sampling of thoughts that a novice might think about this:

          • “What is pthread_mutex_destroy? I’m not calling that in my code.”
          • “What’s a ‘waiter’? It sounds like I’m in a restaurant.”
          • “That pthread thing is ‘on mutex’? What does it mean by ‘on mutex’?” (I know it means “called on a mutex”, but it omits enough words to be not only unclear about this but also grammatically incorrect.)
          • “What do I even do about this message? Is it even a bad thing?”

          Here’s a sample error message that would get it across much more clearly:

          Warning: program terminated while a mutex was still locked.

          Or, if the error message isn’t limited to program termination:

          Warning: a mutex was destroyed while it was still locked.

          You could also add some informational text to tell the programmer what to look for:

          Please ensure that all mutexes are unlocked before they are destroyed.

          I know when you’re programming, and you’re deep in the context of a particular system like pthreads, it’s really easy to just throw an error message into stderr that relies heavily on the specific keywords for that context. But the people who’ll see that message won’t necessarily have that context. They could be seeing it somewhere else entirely, working on a far higher level system. It’s worth keeping that in mind.
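
          As a rough illustration of what such a message could look like in practice, here is a small C sketch of a debugging helper. The name `destroy_if_unlocked` is hypothetical, and this is not how OpenBSD implements its check. It probes the mutex with `pthread_mutex_trylock()` and prints a plainer warning instead of destroying a mutex that is still locked. The probe is racy if other threads can still take the lock, so treat it as a debugging aid only.

          ```c
          #include <assert.h>
          #include <errno.h>
          #include <pthread.h>
          #include <stdio.h>

          /* Hypothetical helper: refuse to destroy a mutex that is still locked.
           * Returns 0 on success, EBUSY if the mutex is held, or another errno. */
          static int destroy_if_unlocked(pthread_mutex_t *m)
          {
              int rc;

              assert(m != NULL);

              /* trylock never blocks; it reports EBUSY if the mutex is held. */
              rc = pthread_mutex_trylock(m);
              if (rc == EBUSY) {
                  fprintf(stderr,
                      "Warning: attempt to destroy a mutex that is still locked.\n"
                      "Please ensure that all mutexes are unlocked before they "
                      "are destroyed.\n");
                  return EBUSY;
              }
              if (rc != 0)
                  return rc;

              /* We took the lock ourselves, so release it before destroying. */
              rc = pthread_mutex_unlock(m);
              if (rc != 0)
                  return rc;
              return pthread_mutex_destroy(m);
          }
          ```

          Calling it on a mutex that the caller still holds prints the warning and returns `EBUSY`. After `pthread_mutex_unlock()`, the same call destroys the mutex and returns 0.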

          1. 8

            “What’s a ‘waiter’? It sounds like I’m in a restaurant.”

            That’s a reach. If you’re programming with mutexes, you know what “waiting on a mutex” is, and “waiters” is pretty obvious from that.

            “That pthread thing is ‘on mutex’? What does it mean by ‘on mutex’?” (I know it means “called on a mutex”, but it omits enough words to be not only unclear about this but also grammatically incorrect.)

            I think error-message-ese, like headline-ese, is a distinct enough form of expression to have its own rules, given how much priority is given to brevity in error messages.

            “What do I even do about this message? Is it even a bad thing?”

            In the spirit of the above: In error-message-ese, an exclamation point means it’s a bad thing, or at least highly unusual.

          2. 3

            I find the specific wording a tiny bit confusing because in the specific example given in the article here, no thread is actually blocked waiting on the lock when it’s destroyed - one holds it, but none are blocked.

            Totally a nitpick though.