1. 30

  2. 4

    While I can generate some better options than the author (unset pipefail in a subshell containing the pipeline, for example), none of them actually overcomes the objection that pipefail creates this footgun (in exchange for fixing the footgun left by its absence).

    As the author says, it would be really nice to have a pipefail that can be parametrised on which signals, if any, caused the exit, or on exit codes.
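    The subshell idea mentioned above can be made concrete. A sketch (using seq as a stand-in for the article's generate, whose source isn't shown here): pipefail stays on for the rest of the script, but is switched off only around the one pipeline whose producer is expected to die of SIGPIPE.

```shell
#!/usr/bin/env bash
set -o pipefail   # on for the whole script...

# ...but off inside this subshell only, so the producer dying of
# SIGPIPE does not mark the pipeline as failed. seq stands in for
# the article's `generate`.
if ( set +o pipefail; seq 1 1000000 | head -n 1 > /dev/null ); then
    echo "pipeline ok"
else
    echo "pipeline failed"
fi
```

    The cost, as noted above, is that the subshell also waves through genuine, non-SIGPIPE failures of the producer.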

    1. 3

      (Edit: Oh, silly me, I realize now that SIGPIPE has a standard exit code. So… go ahead and ignore the first paragraph.)

      Having the shell track whether the failure was due to SIGPIPE unfortunately can’t generally work, because it’s the generate program itself that has to handle it. One could imagine a protocol whereby programs respect some DONT_ERROR_ON_SIGPIPE environment variable or something, but that would be unreliable.

      For what it’s worth, my shell language Rash does have a way to specify for each pipeline whether to care about the exit codes of intermediate subprocesses. The default is to care about each subprocess that terminates by the time the last subprocess does. This, of course, leads to things like potential SIGPIPE races.

      Each choice has its downsides, though – always caring about each subprocess means you have to wait for each to complete, which can cause some pipelines to stall while they wait for a program that was written to run forever until killed. Never caring is more obviously a bad idea.

      Rash also has a way for each pipeline stage to determine whether the exit code is successful (with a list, or with an arbitrary predicate function). You can define names (e.g. aliases) to re-use that logic without rewriting it every time. But frankly, handling the ad-hoc conventions for exit codes, termination conditions, etc., for each program you call from a shell is a bit of a mess no matter what tool you use.
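      The per-stage “is this exit code successful?” idea can be approximated in bash terms, too. A loose sketch of the concept (this is bash, not Rash syntax; names are mine): give each stage a list of exit codes to treat as success, and check the stage's status from PIPESTATUS against it. A SIGPIPE death shows up as 128 + 13 = 141.

```shell
#!/usr/bin/env bash

# Loose bash approximation of a per-stage success predicate: each stage
# gets a list of exit codes to treat as success. (Not Rash syntax.)
ok_codes() {          # usage: ok_codes <status> <acceptable-code>...
    local status=$1; shift
    for code in "$@"; do
        [ "$status" -eq "$code" ] && return 0
    done
    return 1
}

seq 1 1000000 | head -n 1 > /dev/null
if ok_codes "${PIPESTATUS[0]}" 0 141; then   # 141 = killed by SIGPIPE
    echo "producer stage counts as success"
fi
```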

    2. 4

      Also, it would be nice to only ignore SIGPIPE-based failures, not other failures. If generate fails for other reasons, we’d like the whole pipeline to be seen as having failed.

      OSH allows you to do this! The run builtin provides fine-grained control over exit codes.

      Compare

      yes | head -n 1   # yes will fail with SIGPIPE
      

      vs.

      run --status-ok SIGPIPE -- yes | head -n 1   # suppresses ONLY SIGPIPE failure
      

      (There is a much longer way to do it with Bourne shell, flipping options on and off, and case statements.)
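      For the curious, here is one shape that longer workaround could take – my sketch using bash's PIPESTATUS and seq as the producer, not necessarily exactly what's meant above:

```shell
#!/usr/bin/env bash
set -o pipefail

set +o pipefail                         # flip the option off...
seq 1 1000000 | head -n 1 > /dev/null
producer=${PIPESTATUS[0]}               # ...grab the producer's status...
set -o pipefail                         # ...flip it back on...

case $producer in                       # ...then case on the status
    0|141) echo "ok (141 = 128 + SIGPIPE)" ;;
    *)     echo "real failure: $producer"; exit 1 ;;
esac
```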

      https://www.oilshell.org/release/0.8.5/test/spec.wwz/oil-language/oil-builtin-run.html

      As mentioned on the blog roadmap, I implemented all this errexit stuff back in October, but am bottlenecked on writing / documentation …

      1. 2

        The sed exits after a thousand lines and closes the pipe that generate is writing to, generate gets SIGPIPE and by default dies, and suddenly its exit status is non-zero, which means that with pipefail the entire pipeline ‘fails’

        Could this be better handled by generate? That is, if generate handled SIGPIPE and didn’t error, the whole pipeline would work as expected.

        For example, trapping SIGPIPE and exiting cleanly in Python seems to work:

        $ ./generate | head -n 3
        1
        2
        3
        $ echo $?
        0
        

        Whereas without the signal handler it errors:

        $ ./generate | head -n 3
        1
        2
        3
        Traceback (most recent call last):
          File "./generate", line 21, in <module>
            raise e
        IOError: [Errno 32] Broken pipe
        $ echo $?
        1
        
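        For reference, here's a self-contained sketch of that recovery in Python. The real generate source isn't shown above, so this simulates the situation with an in-process pipe whose reader has already gone away; note that Python ignores SIGPIPE at startup, so the failed write surfaces as a BrokenPipeError exception rather than a signal death.

```python
import os

# Simulate `./generate | head -n 0`: a pipe whose reader is already gone.
r, w = os.pipe()
os.close(r)                  # the downstream "head" exits
out = os.fdopen(w, "w")

n = 1
try:
    while True:              # generate would loop like this forever
        print(n, file=out, flush=True)
        n += 1
except BrokenPipeError:      # EPIPE from the write, since SIGPIPE is ignored
    try:
        out.close()          # avoid a second BrokenPipeError at interpreter exit
    except BrokenPipeError:
        pass
    print("caught BrokenPipeError; exiting cleanly")
```

        In a real generate, the except clause would end with sys.exit(0), which is what makes echo $? print 0 in the transcript above.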
        1. 2

          Could this be better handled by generate? That is, if generate handled SIGPIPE and didn’t error, the whole pipeline would work as expected.

          Yes, generally if your command is supposed to be usable with head or grep or whatever, you ought to exit 0 on SIGPIPE, to avoid exactly this issue. If a program doesn’t do that, it’s almost certainly a bug and should be filed and fixed.

          1. 1

            Sure, but ./generate might not be under your control. You might be able to wrap it, of course.
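            The wrapper can be small. A hypothetical sketch (the function name is mine, and seq stands in for a third-party ./generate we can't fix): map a SIGPIPE death, exit status 141, to success, and pass every other status through unchanged.

```shell
#!/usr/bin/env bash
set -o pipefail

# Hypothetical wrapper: treat exit 141 (128 + 13, SIGPIPE) of the wrapped
# command as success; pass any other status through unchanged.
sigpipe_ok() {
    "$@"
    local status=$?
    [ "$status" -eq 141 ] && return 0
    return "$status"
}

# seq stands in for a third-party ./generate:
sigpipe_ok seq 1 1000000 | head -n 1 > /dev/null && echo "pipeline ok"
```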

          2. 2

            Let me ask an abstract question: what is the right pattern for handling “middle of the pipeline expectedly exits” situations in concurrent systems?

            One way to handle that is to error when sending data to the next filter, but it suffers two problems:

            • it breaks unidirectionality of the channels

            • it delays the cancellation – you get notified only during the next write, and not immediately. This might introduce unbounded delays:

              $ cat | (head -n 2 && echo done)
              hello
              hello
              world
              world
              done
              one hour later :-(
              $
              
            1. 1

              Not sure if this is what you mean, but when I use queues I usually use a sentinel value like None, which is basically the equivalent of close(fd). It says “nothing more will be written to this pipe/channel/queue”.

              That leads to the cleanest code usually. It’s all “straight line” code in the producer and consumer.

              Having a fatal SIGPIPE also works, but doing something other than dying is more difficult. I think the sentinel lets you handle it more cleanly. It’s unidirectional and synchronous.
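              As a sketch of that sentinel pattern with Python's standard queue module (the names here are mine):

```python
import queue
import threading

SENTINEL = None  # in-process analogue of close(fd): "nothing more is coming"

def producer(q):
    for n in range(3):
        q.put(n)
    q.put(SENTINEL)          # announce end-of-stream exactly once

def consumer(q, results):
    while True:
        item = q.get()
        if item is SENTINEL: # clean, unidirectional, synchronous shutdown
            break
        results.append(item)

q = queue.Queue()
results = []
t = threading.Thread(target=producer, args=(q,))
t.start()
consumer(q, results)
t.join()
print(results)               # the consumer saw 0, 1, 2, then the sentinel
```

              One place this is messier than close(fd): with multiple consumers on the same queue, you need one sentinel per consumer.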