1. 39

  2. 29

    I think we are (slowly) fixing such problems in newer languages. The mindset changes from “if your program crashed you’re a bad programmer and you should feel bad” to “it’s the programming language’s fault that it hasn’t caught your bug”.

    For example, look at this fork wrapper:

    fn fork() -> Result<Fork, i32>
    

    You’ll get a warning if you ignore the return value, and a compile error if you try to use it as a pid without handling the error somehow. Even if you want to handle the error by aborting the program, you still have to write that explicitly. And the Fork type is an enum Parent | Child(pid), so you have to match on it; you can’t just run with let pid = fork().
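    As a sketch of what that forces on the caller (the enum shape here roughly follows wrappers like nix::unistd::fork, but the raw syscall is stubbed out so the snippet stays self-contained):

```rust
// Sketch only: a real wrapper would call the actual fork syscall and read
// errno; here both are stubbed so the example runs anywhere.
#[derive(Debug, PartialEq)]
enum Fork {
    Parent { child_pid: i32 },
    Child,
}

// Translate the raw C-style return value into the typed result.
fn interpret_fork(raw: i32) -> Result<Fork, i32> {
    match raw {
        -1 => Err(11), // a real wrapper would read errno here; 11 is illustrative
        0 => Ok(Fork::Child),
        pid => Ok(Fork::Parent { child_pid: pid }),
    }
}

fn main() {
    // The compiler walks us through both layers: first the Result, then the
    // enum. There is no way to "just grab the pid" and run with it.
    match interpret_fork(123) {
        Ok(Fork::Parent { child_pid }) => println!("forked child {}", child_pid),
        Ok(Fork::Child) => println!("running in the child"),
        Err(errno) => eprintln!("fork failed, errno {}", errno),
    }
}
```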

    1. 6

      While I completely agree newer languages help here (specifically those with algebraic datatypes), it isn’t an end-all solution. In Rust, I’ve seen many people just bubble up errors like this with ? and it ends up acting like an exception (especially with libraries that automatically convert all errors into a universal type). This is 100x better than just failing with segfault, but it still doesn’t actually handle the problem.

      It really is just a cultural issue. So many developers believe solving the problem means handling only the happy path, when true engineering means handling both the happy path and the failure path. There is only so much a language can actually do here. If a developer believes “good” code looks like clean happy-path code and nothing else, they are doomed to write terrible code.
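      A minimal sketch of that bubbling pattern (ForkFailed and raw_spawn are made-up names for illustration):

```rust
use std::error::Error;
use std::fmt;

// Hypothetical error type, purely for illustration.
#[derive(Debug)]
struct ForkFailed(i32);

impl fmt::Display for ForkFailed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "fork failed with errno {}", self.0)
    }
}
impl Error for ForkFailed {}

// Stubbed-out low-level call that always fails.
fn raw_spawn() -> Result<u32, ForkFailed> {
    Err(ForkFailed(11))
}

// `?` plus Box<dyn Error> quietly converts every failure into one opaque
// type. The caller can print it, but has nothing concrete to branch on,
// which is the exception-like behaviour described above.
fn spawn_worker() -> Result<u32, Box<dyn Error>> {
    let pid = raw_spawn()?; // propagated, not handled
    Ok(pid)
}

fn main() {
    if let Err(e) = spawn_worker() {
        eprintln!("something went wrong: {}", e); // generic catch-all
    }
}
```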

      1. 12

        still doesn’t actually handle the problem

        Depending on the kind of program, just aborting is a perfectly valid way to handle an error, especially a “fork failed” kind of error — essentially the “your OS environment is kinda screwed up” condition.

        You definitely don’t want that in your spacecraft navigation controller (but that ideally shouldn’t have any dynamically-allocated resources at all), you would prefer to avoid it in a document editor with unsaved state (although users of those are conditioned to Ctrl-S early, Ctrl-S often), but who cares if a media player or system info monitor just gives up when your system is out of resources.

        The key problem with old-school C error handling is that a quick segfault in the right place is the good outcome: lldb -c program.core program, bt, “oh I see lol”. The far more insidious issue is that the program can go on as if nothing bad happened and start doing wildly incorrect things, and the article’s example of sending a signal to the -1 “pid” returned by the failed fork call is a perfect example of that.
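        A stubbed-out sketch of that failure mode (fork_c_style and kill_c_style are made-up stand-ins mimicking the C-style return conventions, so nothing actually gets signalled):

```rust
// Pretend fork failed; C hands us back a plain integer with no type safety.
fn fork_c_style() -> i32 {
    -1
}

// Mimics the calling convention of kill(2). For the real syscall,
// pid == -1 means "send the signal to every process you are allowed to
// signal", which is why the failed fork's -1 flowing in here is a disaster.
fn kill_c_style(pid: i32, sig: i32) -> i32 {
    if pid == -1 {
        println!("would signal EVERY process with signal {}", sig);
    }
    0
}

fn main() {
    let pid = fork_c_style(); // nothing forces us to check for -1...
    kill_c_style(pid, 15); // ...so the error value flows straight into kill
}
```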

        1. 2

          True, arguably for most applications simply failing with a relevant error message is appropriate. Taking the article’s argument in general, though, and not as specific to fork or segfaulting: properly handling errors is not something that is taught, or even well regarded.

          1. 2

            This is the crux of the problem, IMO: different applications need to handle errors differently, AND different kinds of errors within the same application need to be handled differently.

            That’s why, IMO, a general-purpose language needs at least three failure mechanisms: one for “you used the API incorrectly or fed it invalid data”, one for “something unexpected happened; let’s message the user and see what they want to do”, and one for “holy shit! There’s no way this will work any more. Just quit”.

            Rust is pretty damn close to right, IMO. My only complaint is that Rust (the std and some of the ecosystem) is still too eager to publish panicking APIs without at least offering a non-panicking version. In general, I’d rather have the non-panicking version than the panicking one, because I can choose to panic on receiving the failure response. It’s harder to “unpanic”, if you even realize that a panic might happen.
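            A rough sketch of how those three mechanisms map onto Rust (the function names are illustrative): panic for contract violations, Result for expected failures, and abort for giving up entirely:

```rust
// 1. "You used the API incorrectly or fed it invalid data":
//    a contract violation is a bug in the caller, so panic.
fn checked_div(a: i32, b: i32) -> i32 {
    assert!(b != 0, "checked_div: divisor must be non-zero");
    a / b
}

// 2. "Something unexpected happened": an expected failure mode, so return
//    a Result and let the caller decide what to do with it.
fn parse_port(s: &str) -> Result<u16, std::num::ParseIntError> {
    s.parse::<u16>()
}

// 3. "There's no way this will work any more": give up immediately.
#[allow(dead_code)]
fn bail_out() -> ! {
    eprintln!("unrecoverable state, quitting");
    std::process::abort()
}

fn main() {
    println!("{}", checked_div(10, 2));
    match parse_port("8080") {
        Ok(p) => println!("port {}", p),
        Err(e) => eprintln!("bad port: {}", e),
    }
    // bail_out() deliberately not called here.
}
```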

            Swift is kinda-sorta there. It has throws and it has Result. But it seems that nobody actually uses Result as a return value.

            Even in Java you were “supposed” to use checked exceptions for expected failure modes and unchecked for the cases where you need to just bubble all the way up or just crash.

            I’m super disappointed that Kotlin, for as ergonomic and awesome as the language is, reverted to just using unchecked exceptions for everything. That’s such a crappy way to do error handling/signalling.

        2. 3

          I want to add to your point. The kernel API matters too; the Linux fork syscall still returns a pid_t, a numeric type that encodes something like Parent | Child(u32 pid) | Error(u32 errno) once combined with C’s errno global error information. Programming language runtimes can only paper over the issue, for instance by shaping this functionality like CreateProcess and not exposing the fork/exec split at all.

          The very nature of fork is that parent processes are invited to confuse themselves as to who they are and with which privileges they are running. All a programming language can do in this situation is delay implementing the raw syscall at user level, and try to write enough tamed wrappers to keep up with demand.

          1. 4

            And this in turn probably has a lot to do with the fact that the Linux kernel is written in C (and thus suffers from C’s primitive type system), and POSIX dates back to a time when C, and languages with type semantics similar to C’s, were the only game in town for writing an OS. If I were writing an OS from scratch today, I would do so in a language that let me define the return type of the fork syscall with a sensible algebraic type. But that’s not C.

          2. 2

            It’s worth looking at how Erlang handles it too.

            1. 2

              In case you’re referring to fork specifically: I don’t think there’s any way to reasonably use fork in the Erlang VM. Generally multi-threaded VM runtimes don’t give you any API for fork, and if you just use FFI to call it from your program, prepare for unforeseen consequences.

              If this is about error handling in general: both exceptions and pattern matching, but of course dynamic pattern matching since it’s dynamically typed. i.e. there’s no Either type, just tuple conventions like {ok, <your result>} vs {error, <your error>} (where ok and error are atoms). And of course you can use try to turn exceptions into tuples to match, and match and throw to do the opposite.

              1. 2

                I wasn’t getting specific about fork but, rather, about what it enables. In Erlang you spawn processes. As far as I can tell, you are guaranteed to get a valid pid from spawn, and there is a good story about what happens when you use a pid for a destroyed process. It’s a case where a differently designed mechanism makes a class of errors impossible.

          3. 16

            All languages may be differently broken, but this is one class of problem which option types and completeness checking handle, so several languages are managing to roughly stomp this problem down, just as most modern languages manage to make buffer overflows and pointer arithmetic … “much harder to achieve”.

            It doesn’t need to be impossible to pull off vulnerable code, if we can just make sure that the default patterns and syntax generally reward non-vulnerable code much better.

            And yes, syntax; too many folks look at stuff like design patterns and think that’s a guideline for writing template code, rather than a guideline for concepts where if something is useful enough, it should migrate into the language with syntactical support, just as design patterns such as “function calls” and “while loops” have done so. We went through too long a period of complete stagnation in mainstream programming languages here.

            Fortunately, we seem to be coming out of that malaise.

            1. 6

              People are resistant to non-brokenness in languages.

              Sum types are starting to come into vogue a little bit; but dependent typing is still a bridge too far, and complete formal verification is right out (except for research and a few small niches).

            2. 5

              I think part of the reason is that when you write example code, be it online or in a book, people often leave out details like error handling to make it easier to grasp the information the author is trying to convey. The problem of course is that people will find these snippets and use them in production.

              As others have noted, some newer languages solve this with, for instance, option types that require checking for errors. I wonder if this could conceivably affect the clarity of example code.
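              As a sketch of how little that has to cost in example-code clarity (the config filename here is made up):

```rust
use std::fs;
use std::io;

// With a Result-returning API, the error-aware version of an example is
// barely longer than the happy-path-only version: one `?` per fallible call.
fn read_config(path: &str) -> Result<String, io::Error> {
    let text = fs::read_to_string(path)?; // propagated in a single character
    Ok(text.trim().to_string())
}

fn main() {
    match read_config("app.conf") {
        Ok(cfg) => println!("config: {}", cfg),
        Err(e) => eprintln!("couldn't read config: {}", e),
    }
}
```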

              1. 3

                We hit this problem with malloc constantly, and it’s compounded by systems that make checking the return value useless because they only account for the memory usage when a page is touched. It drives me nuts how much we have to overprovision memory for these programs on systems that account for memory usage when it’s allocated.

                1. 4

                  Also, I argue with people who teach programming about this a lot. They omit error checking on things like slides “to save space”, and get annoyed when I say they’re showing a style of code that should never be used in the real world.

                  1. 2

                    For servers this actually makes sense; the trick is to shut down on OOM instead of killing random processes. You have to handle random server shutdowns consistently anyway, because hardware failures happen.

                    For desktop, you can turn off overcommit fairly easily. Only Windows has a proper solution, though: there, reserving and committing memory are separate operations, and the failure of each can be handled separately.

                    1. 1

                      How does that make sense? You ask for memory and apparently get it, but when you try to actually use it, a random process, or (as you’re suggesting) the whole box, gets shot in the head? That’s some psycho Stockholm-syndrome BS right there.

                      You’re also arguing that a distributed system is a reasonable bandaid? Now I have two problems. Or maybe I need 2n+1 problems?

                      1. 1

                        arguing that a distributed system is a reasonable bandaid?

                        No. If you already have a distributed system, then overcommit reduces the memory requirements of each host in your system. If your problem fits on one box, then by all means provision 64 GB of RAM and turn off overcommit. But that is not the case for the majority of Linux kernels deployed in the world.

                  2. [Comment removed by author]

                    1. 1

                      What does this have to do with the article?

                      1. 3

                        Sorry, I accidentally posted it under the wrong article.