1. 26
  1.  

  2. 6

    Don’t want overcommit? Turn it off.

    me@host$ head -n3 /proc/meminfo 
    MemTotal:       32815780 kB
    MemFree:        13309168 kB
    MemAvailable:   19461460 kB
    me@host$ cat /proc/sys/vm/overcommit_memory 
    0
    me@host$ python -c 'import os; s = (12 << 30) * "."; print s[:3]; os.fork()'
    ...
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    OSError: [Errno 12] Cannot allocate memory
    
    1. 1

      > Don’t want overcommit? Turn it off.

      Just don’t do it in production, because sooner or later you’ll come to understand why it’s the default.

      1. 5

        Sure, and I agree that it’s an appropriate default for most common-case systems (though there are legitimate reasons to want to disable it). My intent was basically “learn your system’s configuration options instead of griping about its defaults”.

        1. 10

          The issue is fork makes no-overcommit unreasonable. With a spawn model you don’t see worst-case memory usage balloon every time you start a program.

    2. 6

      I suspect the real reason for fork() being like it is was that they were trying to keep things simple and fork() was simple and worked for them at the time. It didn’t stand the test of time as well as it might have though.

      1. 13

        According to Dennis Ritchie, that is in fact the reason: http://www.read.seas.harvard.edu/~kohler/class/aosref/ritchie84evolution.pdf

        I’m not really sure what the point of this article is. All of the author’s complaints about fork, and the alternatives that address them, have been well known for decades, and are explained in the man pages and in practically every book that covers POSIX system calls.

        UNIX beat out the alternatives, but nobody’s ever claimed it’s perfect.

        1. 2

          Agreed. The author’s spawn() solution is exactly the approach taken in Mach (1985). You create a task, you set up threads in that task, then you tell the threads to start. fork() was supported as a special case by adding a flag to copy the parent’s memory to the child.

      2. 3

        > There’s a cool flag that makes it so you don’t have to reap the process, too, which is nice because reaping children is another really stupid idea.

        I… is it? It doesn’t seem a wholly unreasonable way to arrange to get the exit status (or other termination details) of your child processes.

        1. 4

          Author here. I originally expanded on this in my first draft, but cut it out to balance complaints with solutions better. In my opinion, waiting on your children is fine if you can afford to block, and if not you have to set up SIGCHLD handlers, which is a non-trivial amount of code and involves signal handling, which is a mess in its own right and can easily be done incorrectly. Or you can use non-blocking waitpid, but that wasn’t a thing until recently. In all of these cases, if the parent doesn’t do its job well, your process table is littered with a bunch of annoying dead entries.

        2. 3

          I love a good rant, but this made me chuckle:

          > Note: Linux offers this via the clone syscall now, but everyone just fork+execs anyway.

          Treating 1994 as “now” is certainly taking quite the long view. I have coworkers who weren’t born yet when clone was introduced.

          1. 2

            Let’s not forget the OpenVMS solution, which was also spawn. That’s where Windows got it from, among other things. When I looked it up, it took parameters about I/O, privileges, resource quotas, and so on. There was a misconception that you had to set all of that in code, whereas some OpenVMS developers said you could rely on defaults and do it with one command and a few arguments. The OS would also make sure things happened in the right order instead of letting the app figure it out. The downside was that it spawned processes slowly. The upside was that the system usually ran long-running, mission-critical processes that didn’t use the fork-on-every-event kind of programming, so the safe-and-secure-by-default philosophy didn’t hurt it much.

            Their spawn method, especially with privileges and metering, reminded me a lot of the features cloud providers were putting into Linux for their own use cases. They might have saved themselves some effort if an efficient version of spawn had gone into UNIX long ago instead of fork, along with the enhancements that would’ve happened to it over time.
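
For anyone who wants to try the two ideas discussed in this thread, the spawn model and non-blocking `waitpid`, here is a minimal sketch in Python. It assumes a POSIX system and Python 3.9 or later (for `os.posix_spawn` and `os.waitstatus_to_exitcode`). The helper name `spawn_and_poll` and its `poll_interval` and `timeout` parameters are made up for illustration, and the sleep stands in for whatever real work the parent would do between polls. How the C library implements `posix_spawn` underneath varies by platform, so this says nothing definitive about memory accounting.

```python
import os
import sys
import time


def spawn_and_poll(argv, poll_interval=0.01, timeout=5.0):
    """Start argv with posix_spawn, then reap it without blocking.

    Hypothetical helper for illustration; argv[0] must be a full path.
    Returns the child's exit code, or raises TimeoutError.
    """
    # The child begins life running the new program; the parent never
    # calls fork() itself.
    pid = os.posix_spawn(argv[0], argv, os.environ)

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # With WNOHANG, waitpid returns (0, 0) right away if the child
        # has not exited yet, so the parent does not block here.
        reaped_pid, status = os.waitpid(pid, os.WNOHANG)
        if reaped_pid == pid:
            # The child is now reaped, so no zombie entry is left behind.
            return os.waitstatus_to_exitcode(status)
        time.sleep(poll_interval)  # stand-in for the parent's real work
    raise TimeoutError(f"child {pid} still running after {timeout}s")


if __name__ == "__main__":
    child = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(spawn_and_poll(child))
```

Polling is only one option. A parent that can afford to block can call `os.waitpid(pid, 0)` instead, and the SIGCHLD route mentioned above avoids polling at the cost of signal handling.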