1. 14
  1.  

  2. 2

    Yup… And bug trackers the world over are still littered with things like… https://bugs.ruby-lang.org/issues/8770

    1. 1

      Can you clarify what you mean here? I love this piece and refer to it frequently myself. I think it brilliantly illustrates why many things in tech are the way they are. I feel like you’re suggesting a point I’m missing.

      1. 8

        Yes, it does illustrate brilliantly why things in tech are the way they are.

        In fact, why many things in our economy are the way they are.

        Their solution to the PC losering problem is to transfer the complexity from the OS to the user.

        ie. Every author of a program that invokes almost any system call must, on every invocation, remember to correctly handle the possibility of it returning EINTR.

        Conversely, setting up a good test that proves that your code handles EINTR correctly, every time, is hard and you receive no help from the OS in doing so.
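        To make that burden concrete, here is a minimal sketch (mine, not from the thread) of the retry wrapper that C programs end up writing around essentially every blocking call:

        ```c
        #include <errno.h>
        #include <unistd.h>

        /* Retry a read() that may be interrupted by a signal.
         * Illustrative only: every blocking syscall in a program
         * needs an equivalent loop (or SA_RESTART, see below in
         * the thread), and forgetting one is the classic bug. */
        ssize_t read_retry(int fd, void *buf, size_t count) {
            ssize_t n;
            do {
                n = read(fd, buf, count);
            } while (n == -1 && errno == EINTR);
            return n;
        }
        ```

        Note the loop silently throws away the interruption, which is exactly what a program that *does* care about signals must not do.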

        Now if the effort that has been expended on fixing obscure sporadic EINTR-related bugs in literally tens of thousands of packages….. had instead been spent on solving the PC losering problem correctly…

        We would all be much much better off.

        ie. The cost of solving the PC losering problem has been externalised to all users of syscalls. This enabled the “Worse is Better” solution to win in the market place by winning the “time to market” race.

        Alas, as is the case with very very many parts of our economic system, we reward via perverse incentives the “cheats” that externalize their costs…. but in the long run our entire civilization pays and pays and pays.

        1. 2

          Alas, as is the case with very very many parts of our economic system, we reward via perverse incentives the “cheats” that externalize their costs…. but in the long run our entire civilization pays and pays and pays.

          Is there any known solution to this problem that would not also sacrifice a lot of good things in the process?

          1. 2

            I think you will find any and every proposed solution will be condemned out of hand, and indeed fought tooth and nail, by those benefiting most from externalizing their costs.

            Thus caution is advised when listening to anyone saying “It won’t work”.

            The world seems to be (deliberately) stuck in this foolish black xor white thinking about economic systems, instead of more thoughtful and nuanced debate.

            ie. I think the world needs its systems refactored, not rewritten.

            ie. We should be focusing on sinks of productivity and value, and tweaking the rules to reduce them.

    2. 1

      On the EINTR/retry issue, it’s not clear to me why a simple ‘return to userspace and retry the instruction’ is harder to implement.

      There is kernel-side state associated with the fd (seek position, network buffers etc) but these need to be maintained anyway for an EINTR return.

      Put another way - what work can the kernel avoid in the ‘return EINTR and let the userspace application call the system call again’ scenario which is required in the ‘return to userspace at PC-1’ scenario?

      1. 5

        There are two big issues: changing entry conditions, and blocking after interrupts.

        Blocking after interrupts is fairly obvious when you think about it. If you are blocking on some syscall in your main thread, receive a signal, and your syscall is restarted automatically, you cannot respond to that signal in your main thread at all. You’d just keep blocking. If the signal was SIGINT and you would want to cleanly shut down, you can’t. You’d be stuck until your syscall unblocks, which could never happen.

        Changing entry conditions is much trickier. For example, some syscalls have timeouts: the amount of time they should block before returning regardless of success or failure. If you start a syscall with a 10 second timeout, what happens if that syscall is cancelled and restarted 5 seconds in? If the same arguments are used, it would be as if the timeout had been 15 seconds. And if you receive 1 signal per second indefinitely, the syscall will block forever. Unlikely, but possible.

        When you call something with a 10 second timeout, you actually mean 10 seconds from now. To restart these syscalls, the kernel would need to preserve that start time entry condition. That’s fairly doable, but there are other entry conditions that aren’t doable at all. If you’re writing size-formatted output to the console, and you receive SIGWINCH, you don’t want to proceed with the write. Instead you need to reformat the output to match the new console dimensions. The kernel certainly can’t do that for you.
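        To illustrate how userspace ends up preserving that start-time entry condition itself (my sketch, not something proposed in the thread): convert the relative timeout into a CLOCK_MONOTONIC deadline once, then recompute the remainder after each EINTR instead of restarting with the full timeout:

        ```c
        #include <errno.h>
        #include <poll.h>
        #include <time.h>

        /* poll() with a true deadline semantic: on EINTR we retry
         * with the *remaining* time, so repeated signals cannot
         * stretch a 10 s timeout into 15 s (or forever). */
        int poll_deadline(struct pollfd *fds, nfds_t nfds, int timeout_ms) {
            struct timespec deadline;
            clock_gettime(CLOCK_MONOTONIC, &deadline);
            deadline.tv_sec  += timeout_ms / 1000;
            deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
            if (deadline.tv_nsec >= 1000000000L) {
                deadline.tv_sec  += 1;
                deadline.tv_nsec -= 1000000000L;
            }
            for (;;) {
                struct timespec now;
                clock_gettime(CLOCK_MONOTONIC, &now);
                long remaining_ms =
                    (deadline.tv_sec  - now.tv_sec)  * 1000L +
                    (deadline.tv_nsec - now.tv_nsec) / 1000000L;
                if (remaining_ms < 0)
                    remaining_ms = 0;
                int r = poll(fds, nfds, (int)remaining_ms);
                if (r != -1 || errno != EINTR)
                    return r;   /* ready fds, 0 on timeout, or real error */
                /* EINTR: loop and retry with the shrunken budget */
            }
        }
        ```

        This is "doable", as the comment says, but every program has to do it by hand.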

        There are so many reasons you might want to change syscall parameters after an interrupt, or do something else entirely. The kernel can’t know all of them. And designing an interface to conveniently accommodate all of them is a lot harder. Thus the worse-is-better solution: never assume what a program wants to do after an interrupt.

        1. 2

          Signals, I think, are now fairly well accepted as one of the “ugly” parts of POSIX.

          signalfd was an attempt at fixing it…. Here is more discussion of that..

          https://ldpreload.com/blog/signalfd-is-useless

          Having spent the last week battling the fine fine fine corner cases of signal handling….

          Sigh.

          I wish linux had something better.

          1. 1

            Thanks for this.

            If you are blocking on some syscall in your main thread, receive a signal, and your syscall is restarted automatically, you cannot respond to that signal in your main thread at all. You’d just keep blocking.

            I’m still not getting this one. I’d envisage:

            • application calls blocking read()
            • application receives SIGHUP
            • kernel sees incoming signal and stops doing read()
            • kernel calls up into user space to run the signal handler in user context for SIGHUP. As far as the application goes, it’s still doing read(), but signals can happen any time anyway, so no problem here?
            • kernel restarts read()

            If you’re writing size-formatted output to the console, and you receive SIGWINCH, you don’t want to proceed with the write.

            I’m not sure I agree. Given that arriving signals are inherently racy, I think it could be considered to also be correct to re-run the system call without the application making a new choice based on the new information. (The system call could easily have completed before the signal arrived - and the application should be prepared for that eventuality).

            When you call something with a 10 second timeout, you actually mean 10 seconds from now. To restart these syscalls, the kernel would need to preserve that start time entry condition.

            This is a good point. However, the optimist in me would like to think this is always solvable with API design. (In the timeout case, this would involve absolute timeout rather than relative).

            1. 2

              Your scenario doesn’t work, because signal handlers are extremely restricted. Signal handlers must be reentrant with respect to all other signal handlers, meaning they can’t allocate memory, can’t use most syscalls, can’t use any libc functions that set errno, and can’t non-atomically modify non-local variables.

              For this reason, signal handlers usually set a global volatile flag and return immediately, allowing the main thread to catch EINTR, check the flag, and handle the signal without restriction.
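              That convention looks like this in C (a minimal sketch; the names are mine):

              ```c
              #include <signal.h>

              /* The handler only writes a sig_atomic_t flag, which is
               * the one thing guaranteed safe; all real work happens
               * in the main loop after a blocking call fails with
               * EINTR and the flag is checked. */
              static volatile sig_atomic_t got_signal = 0;

              static void handler(int signo) {
                  (void)signo;
                  got_signal = 1;   /* the only safe thing to do here */
              }

              int install_handler(int signo) {
                  struct sigaction sa = {0};
                  sa.sa_handler = handler;
                  sigemptyset(&sa.sa_mask);
                  sa.sa_flags = 0;  /* no SA_RESTART: let syscalls return EINTR */
                  return sigaction(signo, &sa, NULL);
              }
              ```

              The main loop then treats EINTR from read() and friends as the cue to check got_signal and shut down or reconfigure, free of signal-handler restrictions.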

              In the SIGWINCH example, doing what you suggest causes significant visual tearing, even though it’s technically correct. But that assumption about racy signals only works if all syscalls are guaranteed to complete.

              However, the optimist in me would like to think this is always solvable with API design.

              Perhaps. Until such an API actually exists, we must write interruptible code somehow.

              (In the timeout case, this would involve absolute timeout rather than relative).

              What absolute time? The system clock can change. The clock that can’t is the monotonic raw clock, which simply ticks upwards from an unspecified time. I don’t think an API that takes a deadline with respect to a meaningless clock beats an API that requires restarting on interrupt.

              1. 1

                Your scenario doesn’t work, because signal handlers are extremely restricted.

                Yes they are, but I don’t think that stops the kernel from invoking them and then restarting the system call afterwards.

                In the SIGWINCH example, doing what you suggest causes significant visual tearing

                Which could already occur. We’re just (slightly) increasing the window in which delivery of the signal will cause it.

                What absolute time?

                The one with the same semantics as a relative timeout - i.e. monotonic. That is the behaviour you would get if you specified “10 seconds from now”.

                1. 2

                  Your scenario doesn’t work, because signal handlers are extremely restricted.

                  Yes they are, but I don’t think that stops the kernel from invoking them and then restarting the system call afterwards.

                  What I think you’re missing here is that since you can’t do much in a signal handler besides setting a flag for the main thread to check, you must have a way to interrupt/cancel/unblock whatever system call is blocking the main thread so that the main thread can start doing whatever the signal called for. If the kernel automatically restarts the system call, the main thread obviously can’t go check that flag and react accordingly.

                  1. 2

                    Yes, you’re right, thank you. But in that case, the interrupted return is the feature, not the bug?

                    The whole ‘worse is better’ thing is cast as “return EINTR” being an undesirable property.

                    1. 2

                      Yes indeed, it is a feature. But it is a feature that complicates life for every call that doesn’t care and doesn’t want to be interrupted.

                      1. 2

                        OK, but it isn’t anything to do with Worse is Better, right? The article characterises it as a deficiency of implementation.

                        In fact, it’s a feature you need if you want to abort a blocking operation.

                        afaics, the only problem is the default. You probably want the SA_RESTART behaviour by default (and perhaps that should be per-fd, rather than per-signal).
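                        For what it’s worth, SA_RESTART is indeed chosen per signal, at handler-installation time via sigaction(); there is no per-fd equivalent. A sketch (the helper name is mine):

                        ```c
                        #include <signal.h>

                        /* With SA_RESTART, most (not all) syscalls interrupted
                         * by this signal are restarted by the kernel instead of
                         * failing with EINTR. The choice is per signal, never
                         * per fd, which is the complaint above. */
                        int install_restarting(int signo, void (*fn)(int)) {
                            struct sigaction sa = {0};
                            sa.sa_handler = fn;
                            sigemptyset(&sa.sa_mask);
                            sa.sa_flags = SA_RESTART;
                            return sigaction(signo, &sa, NULL);
                        }
                        ```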

                        But I don’t think the characterisation in the article is fair, unless I’ve missed something.

                        1. 1

                          Worse is better means choosing a design that’s worse in some cases because at least it works for all cases. EINTR absolutely embodies that description. It’s a dead simple way to make sure you can always handle interrupts, but extremely tedious to work with in the common case.

                          1. 2

                            That was my understanding of the point made in the article, and the received wisdom regarding this article.

                            However, I thought the conclusion of the discussion above was that it wasn’t any easier than not handling the interrupt. In fact, the interrupted behaviour is required (it is a feature) in many cases.

                            I still fail to see how it is easier for the kernel to return EINTR rather than restart the syscall. (Apart from the API issue mentioned regarding entry conditions, e.g. relative/absolute times).

                            It might help if someone can outline the desired behaviour. It isn’t “restart syscall”, since that has been disposed of as undesirable above.

                            I think my point is: “this article says the Unix EINTR approach is a short cut in kernel implementation which has imposed a cost on userspace since then”. However:

                            a) I can’t see the shortcut being taken (no one has pointed out how it is easier for the kernel to return EINTR than to restart a read())

                            b) arguments above are for EINTR being a good thing

                            Is there a 3rd way (other than ‘abort read() early and return EINTR’ or ‘restart read()’) which I’m missing here?

                            1. 2

                              I’m totally out of my element here, but didn’t EINTR evolve over the years? Like, it started really cheap and not very useful, but became better (and more complex) over time? Is this what the author referred to when they said that unix/c improved from 50% to 90%?

            2. 0

              Thus the worse-is-better solution: never assume what a program wants to do after an interrupt.

              One might argue that interrupts are a broken IPC system. Errno definitely is broken.

              However, syscalls can fail (like any other computation).

          2. 1

            My question: has this been studied empirically from an economic or sociological perspective? “Worse is Better” is a very influential text in our industry, but I don’t want to take its claims as gospel until we’ve done the legwork.

            1. 1

              We do have piles of anecdotes supporting the claim that a half-assed, semi-working solution now gets more market share than a nearly-perfect one in the same space later. Most businesses succeed that way. People also use many inconsistently-designed programs when they solve a problem that needs solving.

              So, those two attributes have plenty of data to use. Past that, I’m not sure what’s been studied.