1. 6
  2. 4

    Question to author:

    I recently had to write the biggest shell script of my life at work, and that gave me a few thoughts. Maybe you could tell me whether Oil will have something like this. Maybe bash or other shells support it, but I mainly write POSIX shell scripts.

    One thing is job control. I think it is a pity that shell scripts don’t have access to proper job control. Linux now has prctl PR_SET_CHILD_SUBREAPER, originating from DragonFlyBSD’s procctl PROC_REAP_ACQUIRE, which FreeBSD now has as well. The shell could expose this interface to properly clean up all the child processes.

    I also would like to name jobs and be notified when they finish or fail. This should print fail:

    (set -e; (sleep 1; exit 1) & wait; echo success) || echo fail
    

    Also, handling of SIGINT is a mess: I don’t know when it will stop the whole script, break a loop, or not work at all. IMHO by default it should stop the whole script. I assume it’s more a property of teletypewriters living inside of 21st century machines.

    The other thing is that it would be nice to have “promises” on shell variables. Example:

    PORT=$(server_on_random_port_between 8000 9000) &
    do_some_other_work
    client_connect $HOST $PORT # it should block here until there is a value under $PORT
    

    You could emulate this with named pipes. However, named pipes are problematic, because you have to name them and keep them in a shared namespace: the file system. And unshare -m is not readily available to ordinary user programs. It would be nicer to have proper names for fds. When the shell creates a pipe for a process, it could just as well keep it and allow the user to put it in a variable (/dev/fd/N). Then all the hacks that require using fd 3 or 5 or some other uncommon number could just use a proper handle that will not interfere with anything.
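
    The named-pipe emulation mentioned above might look roughly like this in POSIX sh. It is only a sketch: pick_port is a hypothetical stand-in for server_on_random_port_between, 8042 is a made-up value, and mktemp -d is assumed to be available (it is common but not part of POSIX).

    ```sh
    # Hypothetical stand-in for server_on_random_port_between:
    # it "picks" a port after a delay and prints it.
    pick_port() { sleep 1; echo 8042; }

    dir=$(mktemp -d)     # private directory, so the FIFO name cannot collide
    mkfifo "$dir/port"

    # The "promise": do the work first, then hand the result over the FIFO.
    # Opening the FIFO for writing blocks until a reader shows up.
    ( port=$(pick_port); echo "$port" > "$dir/port" ) &

    # ... do_some_other_work would run here, in parallel ...

    read -r PORT < "$dir/port"   # blocks until the background job writes a value
    wait "$!"                    # reap the producer
    rm -r "$dir"
    echo "PORT=$PORT"
    ```

    The value is computed inside the subshell before the FIFO is opened. Writing `pick_port > "$dir/port" &` directly would delay the work itself until the reader opens the pipe.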

    I could then create a long-running background process, keep its stdin and stdout pipes in shell variables, and provide them to other short-running processes. I have very little idea how to make a nice syntax out of it. Maybe something like this:

    <&+ $SERV_IN long_running >&+ $SERV_OUT &
    < $SERV_OUT short_running_1 > $SERV_IN
    < $SERV_OUT short_running_2 > $SERV_IN
    
    1. 3

      As you indicated might be the case, I think much of what you describe in the latter part of your comment can be achieved with existing functionality in bash…

      $ slow_thing() { sleep 10; echo bork; }
      $ exec {fd}< <(slow_thing)
      # ...arbitrary other stuff here...
      $ read -u $fd foo # this will block until slow_thing outputs a line
      $ echo $foo
      bork
      $ exec {fd}<&- 
      

      And for the “long-running background process” case, bash offers the coproc built-in that does pretty much exactly what you describe (and has come in quite handy for me at times).
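
      A minimal coproc sketch of that “long-running background process” case, assuming bash 4.1 or later. The SERV name and the upper-casing loop are made up for illustration. The fd numbers and PID are copied into ordinary variables right away, because bash unsets SERV and SERV_PID once it notices the coprocess has exited.

      ```bash
      # Hypothetical long-running server: upper-cases each request line.
      coproc SERV { while read -r line; do echo "${line^^}"; done; }

      serv_pid=$SERV_PID
      serv_out=${SERV[0]}   # read end of the coprocess's stdout
      serv_in=${SERV[1]}    # write end of the coprocess's stdin

      # Two short "clients" sharing the same server.
      echo "hello" >&"$serv_in"
      read -r reply1 <&"$serv_out"

      echo "world" >&"$serv_in"
      read -r reply2 <&"$serv_out"

      # Closing the server's stdin ends its loop; then reap it.
      exec {serv_in}>&-
      wait "$serv_pid"
      echo "$reply1 $reply2"
      ```

      Any external command can be pointed at the same handles with `<&"$serv_out"` and `>&"$serv_in"`, which is close to the $SERV_IN / $SERV_OUT idea in the question.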

      1. 3

        Thanks, this is good feedback.

        1) One of my goals is to make a very tight interface to the kernel, including non-portable parts of Linux (and BSD if the expertise arrives to help)

        I mentioned cgroups and seccomp here: https://github.com/oilshell/oil/wiki/Project-Goals . So yes advanced prctl would fall under that. It’s still kind of far off, beyond breaking the Python dependency, but it’s something I want to do.

        2) You might have more success with { foo; bar; } vs (foo;bar). I noticed in the “Problems with $((” [1] post and in many real codebases that people use the latter because it’s easier to type (no spacing and semicolon problems).

        $ { sleep 0.1; exit 99; } & wait; echo "status $?"
        [1] 800
        [1]+  Exit 99                 { sleep 0.1; exit 99; }
        status 0
        

        Yes so I guess your problem is that “wait” eats the exit code? This appears to work:

        $ { sleep 0.1; exit 99; } & wait $!; echo "status $?"
        [1] 805
        [1]+  Exit 99                 { sleep 0.1; exit 99; }
        status 99
        

        But yeah it’s highly nonintuitive.

        I agree that naming both processes and pipes/ports is a good idea. I’ve played around with named pipes in shell scripts, but it’s always been confusing. It doesn’t quite work and most people don’t do it. So there is something to look at there.

        Somebody suggested similar things here: https://www.reddit.com/r/oilshell/comments/5whyac/csp_style_asynchronous_commands_issue_5_github/

        But the conversation didn’t go anywhere. What I will say is that it helps a lot if somebody suggests something concrete :) But most work is on OSH now and not Oil, so it’s not super urgent. But I’m keeping these things in the back of my mind.

        I like the analogy to promises.

        3) SIGINT – I will do what I can outside of the kernel, but some things are constrained by the Unix model. This will have to wait until I break the Python dependency too. Python does some of its own things with signals which I don’t want.

        [1] http://www.oilshell.org/blog/2016/11/18.html

        1. 1

          Thanks for the answer. I will go through the links later.

          1) That’s very nice to hear.

          2) The problem with this:

          $ { sleep 0.1; exit 99; } & wait $!; echo "status $?"
          

          is that it will work only for this artificial case. What about a more realistic one:

          $ { sleep 0.1; exit 99; } & do_some_work; { sleep 2; exit 89; } & do_some_other_work; wait # ?
          

          That’s of course when the naming of processes comes in handy.

          3) Could you point me to something on this? I assume that I have to read The TTY demystified [1]

          [1] http://www.linusakesson.net/programming/tty/

          1. 2

            I think you can sort of do what you want by writing it out on separate lines and then capturing the PID, something like this:

            { sleep 0.1; exit 99; } &
            pid1=$!
            
            do_some_work  # serial, status available
            status2=$?
            
            { sleep 0.1; exit 89; } &
            pid3=$!
            
            do_some_work  # serial, status available
            status4=$?
            
            wait $pid1
            status1=$?  # async status
            
            wait $pid3
            status3=$?  # async status
            

            In this style:

            https://github.com/oilshell/oil/commit/2632ae5aa5c22602fae41adab922b195f38bd726

            I just learned there is “wait $pid” and “wait -n” which waits for the next process. (I tend to use xargs -P more for parallelism, it usually does the job.)

            A problem with this is that wait $pid blocks unnecessarily, and the problem with “wait -n” is that you don’t know which process finished. There’s probably something to solve there. I just ran into a very similar issue in the oil codebase itself with $PIPESTATUS.
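
            One way to approximate named jobs in bash 4+ is to keep the PIDs in an associative array and wait on each one. This is a sketch only: the job names and the pid_of / status_of arrays are invented for illustration, and the sleeps stand in for real work.

            ```bash
            declare -A pid_of status_of   # job name -> PID, job name -> exit status

            # Hypothetical jobs standing in for the two background blocks above.
            { sleep 0.1; exit 99; } & pid_of[first]=$!
            { sleep 0.2; exit 89; } & pid_of[second]=$!

            # Wait for each job by PID, so every status is attributed to a name.
            for name in "${!pid_of[@]}"; do
              status_of[$name]=0
              wait "${pid_of[$name]}" || status_of[$name]=$?
            done

            echo "first=${status_of[first]} second=${status_of[second]}"
            ```

            This still waits in a fixed order rather than in completion order. As far as I know, bash 5.1 added `wait -n -p VAR`, which stores the PID of the job that finished in VAR, and that would cover the “which one finished” gap.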

            3) I guess I would need a more specific problem with SIGINT. I usually refer to the Stevens' APUE book for all this kind of stuff (which I routinely forget).

            If you are only using one process, there should be no problem with sigint. Bash behaves like Python.

            If you call processes synchronously or asynchronously, I think they should be part of the same process group and process Ctrl-C the same, so everything should work. The foreground process group is what the terminal uses for distributing signals, I believe, but I would have to look it up again.

            The problem is that child processes can ALWAYS “escape”. They can do anything they want. They can change their session or process group.

            Also I think tmux and screen can interfere. I’m not surprised at all you’ve run into problems with Ctrl-C, but I would have to have something specific to know why it is or isn’t working. A shorter answer would probably be “interaction between applications, not all of which the shell can control”.

            A related issue is that in Unix it was IMPOSSIBLE to fork a process and clean up all children that the process started. You have to trust the other process to behave a certain way. On Linux you can clean up after UNtrusted processes with the cgroups mechanism.

            So yeah the shell can’t account for the behavior of other processes under the Unix model. If you think of it in adversarial terms, an adversary can always fuck you up, and they do on occasion.

            But if there is something specific in the shell, I’d like to fix it.

            1. 1

              Having all PIDs I could emulate “proper” wait for the price of those extra waits. Thanks, that is nice enough for now. But I probably should look more into xargs -P as you said later.

              I planned to write this shell script as a Makefile at first, and then I would have parallelisation for “free”. But I have a single background job that is required later. Also, I would use another one if I knew how to name fds (1amzave said how to do it in bash). I agree with you that the shell should absorb make and awk.

              Regarding cgroups, the problem is that an unprivileged user can’t create them AFAIK. Maybe there could be a setuid program that would prepare a cgroup, run a single direct child that does the rest, and, when that child exits, terminate and later kill all the processes left in the cgroup.

              It’s funny that there are proposed systems that can spawn a VM for every single web request, but starting a process is considered slow. It may be, but I think there is a place for a system with faster process starting. That would help really isolate processes as they should be. Because what’s really the difference between a process and a VM?

              Chrome lately changed their JS pipeline. Now they are using Ignition - a fast startup interpreter and Turbofan - an optimizing compiler. It confirms for me that interpreted (shell) and compiled language (C etc.) combo is a very good model.

              1. 1

                Yeah as far as I can tell right now, if you want to run a bunch of things in parallel, and then wait until they’re all done, getting all exit codes, it should basically work, in either shell or xargs. It isn’t pretty, but it will work.

                If you want to do something more advanced, I think the limitation is that “wait -n” doesn’t tell you what process goes with which exit code.

                xargs also has problems getting the exit code. What I do is have xargs invoke a shell function that saves the exit code to a file, so the process tree looks like “sh -> xargs -> sh -> program to be parallelized”. That is what I do in the Oil spec test runner:

                https://github.com/oilshell/oil/blob/master/spec-runner.sh
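
                A rough sketch of that pattern, not the actual spec-runner code: task and run_one are hypothetical functions, the inputs a/b/c are made up, and it assumes bash (for export -f) plus an xargs that supports -P.

                ```bash
                # Hypothetical task: fails with status 3 for input "b", succeeds otherwise.
                task() { [ "$1" != b ] || return 3; }

                # Wrapper that xargs invokes: run the task, save its exit code to a file.
                run_one() {
                  local name=$1 dir=$2 status=0
                  task "$name" || status=$?
                  echo "$status" > "$dir/$name.status"
                }
                export -f task run_one   # make both visible to the child bash

                dir=$(mktemp -d)

                # Process tree: bash -> xargs -> bash -> task, two jobs at a time.
                printf '%s\n' a b c | xargs -P 2 -I {} bash -c 'run_one "$1" "$2"' _ {} "$dir"

                for f in "$dir"/*.status; do
                  echo "$(basename "$f" .status): $(cat "$f")"
                done
                ```

                The wrapper always exits 0, so xargs itself never stops early, and the per-task statuses stay in "$dir" until the caller removes it.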

                So yeah I want Oil to subsume the tools that start processes in parallel: sh with &, xargs -P, and make -j. They really all belong together. (GNU parallel is mentioned in the GNU bash manual.)


                Yeah, cgroups and other kernel features need better user space tools/interfaces.

                I have created setuid programs to use advanced Linux kernel features. I think you have to do it that way. That is related to the Bernstein style[1], because you need separate and chained executables to follow the principle of least privilege with the setuid model.

                [1] http://www.oilshell.org/blog/2017/01/13.html

            2. 2

              Addendum: I don’t actually use & that much – I use xargs -P for batch and I use tmux for interactive.

              But now I’m looking at the bash manual for job control, and yeah it is super weird, like:

              { sleep 1; exit 99; } &
              
              %{   #  bring it into the foreground because it starts with {  .
                     # You can also use %1.
              

              So yeah I definitely agree explicit names are better there.

              And I agree about names for file descriptors. Bash does have it with cat {fd}<<EOF , but it’s fairly rare to see people use that.

              It’s pretty odd and should be unified with the normal variable binding mechanisms.

              A shell is a language of processes and file descriptors, so those things should be “first class” and nameable in a consistent way. I see a big design flaw there.