1. 16
  1. 7

    If it’s a CGI situation, maybe REMOTE_ADDRESS or whatever wasn’t defined for some reason. What then? Do we throw? Yuck!

    YES! It’s not like you weren’t in a CGI situation before. If the caller was expected to do something about this, they should’ve already done it; they have not, and so this is the last chance to tell them. Otherwise you’re going to have people using your library in ways you do not anticipate and that are difficult to support.

    If you know how to check, then rather than checking in every call whether you’re running as a CGI (this won’t change from one line to the next!), you can simply offer your users an is_cgi() function that returns the boolean they should be checking. They can then do whatever they want before ever getting to the point of looking up the remote address.
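A minimal sketch of that split, assuming the check is done through the environment. The CGI/1.1 spec has the server set GATEWAY_INTERFACE for every request and names the address variable REMOTE_ADDR; the function name remote_address() and the choice to abort() are illustrative assumptions, not anything from the article.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* True when the process looks like a CGI script. Assumption: the server
 * follows CGI/1.1 and sets GATEWAY_INTERFACE for every request. */
bool is_cgi(void)
{
    const char *gw = getenv("GATEWAY_INTERFACE");
    return gw != NULL && gw[0] != '\0';
}

/* Hypothetical lookup. The caller is expected to have checked is_cgi()
 * already, so a missing REMOTE_ADDR is treated as an abort error. */
const char *remote_address(void)
{
    assert(is_cgi());   /* programmer error: call is_cgi() first */

    const char *addr = getenv("REMOTE_ADDR");
    if (addr == NULL || addr[0] == '\0')
        abort();        /* last chance to tell the caller something is wrong */
    return addr;
}
```

A caller that can run outside CGI checks is_cgi() once at startup and picks a different code path; one that cannot simply calls remote_address() and accepts the unwind.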


    I divide errors into three categories and use them to guide decision-making:

    • Domain Errors: These are your divisions by zero, where the inputs are simply not something the function can meaningfully support. These should be represented by a value (such as null, or NaN, i.e. 0/0.0, in a vec of floats); Option types can be in this category.
    • Stop Errors: These are situations where you simply cannot proceed so you might as well stop. Out of disk space errors, network failures, and pretty much anything temporary is this kind of error, because the best thing your process can do is stop and wait for the problem to go away.
    • Abort Errors: These are your dynamic-unwinds. You can’t proceed, and nothing is going to change that. You’re in this situation with REMOTE_ADDRESS and so you unwind. End of.
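To make the first category concrete, here is a small sketch of a domain error reported as a value: the mean of an empty array comes back as NaN instead of raising anything. The function name and signature are made up for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Domain error as a value: the mean of zero elements is not meaningful,
 * so the answer is NaN and the caller decides what that means for them. */
double mean(const double *xs, size_t n)
{
    assert(xs != NULL || n == 0);   /* programmer mistake, not a domain error */

    if (n == 0)
        return NAN;

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += xs[i];
    return sum / (double)n;
}
```

The caller tests the result with isnan() only where an empty input actually matters to them; nothing unwinds and nothing has to be caught.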

    This simple taxonomy has proven very robust for me, and it seems to work pretty well in user-training exercises, but the second type (stop errors) tends to be pretty controversial amongst experienced programmers. I don’t completely understand why. We use multitasking operating systems, and when we download a file too big for our disk, we (users) don’t want to wait until the disk is full, delete everything and generate an error (just to potentially try again); we would like an alert telling us the disk is full so we can clean something up and then resume the download. Nonetheless, most APIs do not take a “stop callback” that lets the application generate that alert, forcing library users to do (sometimes) excruciating things to provide a nice user experience.

    One thing that sometimes helps is referring to the “stop callback” as a “progress callback” that can be called regularly to provide feedback to the user as to whether we are making progress or not. Sometimes this only includes a percentage (or something else), forcing the caller to compute the rate of progress and come up with their own heuristics, but some provide additional status information that would be useful in generating a good alert message.
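A rough sketch of what such an API could look like, assuming a chunked transfer. Every name here (transfer, xfer_progress, the demo_ helpers) is hypothetical; the point is only that the callback receives status as well as a count, and that returning false degrades the stop into an ordinary failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { XFER_OK, XFER_STALLED } xfer_status;

typedef struct {
    size_t done;          /* chunks written so far */
    size_t total;         /* chunks expected */
    xfer_status status;   /* extra status, useful for a good alert message */
} xfer_progress;

/* Return true to keep going (or retry after a stall), false to give up. */
typedef bool (*progress_cb)(const xfer_progress *p, void *user);

/* Hypothetical sink: writes one chunk, returns false when it cannot. */
typedef bool (*write_chunk_fn)(size_t index, void *user);

bool transfer(size_t total, write_chunk_fn write_chunk,
              progress_cb on_progress, void *user)
{
    size_t done = 0;
    while (done < total) {
        bool ok = write_chunk(done, user);
        if (ok)
            done++;
        xfer_progress p = { done, total, ok ? XFER_OK : XFER_STALLED };
        if (!on_progress(&p, user))
            return false;   /* the application chose to stop waiting */
    }
    return true;
}

/* Demo application state: a "disk" that is full until the user is alerted. */
typedef struct { bool disk_full; int alerts; } demo_state;

bool demo_write(size_t index, void *user)
{
    (void)index;
    return !((demo_state *)user)->disk_full;
}

bool demo_alert(const xfer_progress *p, void *user)
{
    demo_state *s = user;
    if (p->status == XFER_STALLED) {
        s->alerts++;            /* a real UI would ask the user to free space */
        s->disk_full = false;   /* pretend they did */
    }
    return true;                /* resume rather than start over */
}
```

With a demo_state that starts out full, transfer(3, demo_write, demo_alert, &state) stalls once, raises one alert, and then completes without re-sending anything.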

    1. 2

      I have my own error categories that go like:

      1. programming errors, like EBADF (not an open file, or the operation couldn’t be done given how the file was opened originally) or EINVAL (invalid parameter) that need to be fixed, but once fixed, never happen again;
      2. errors that can be fixed, like EACCES (bad privileges) or ELOOP (too many symbolic links when trying to resolve a filename), where the fix has to happen outside the scope of the program, but once fixed, they tend not to happen again unless someone makes a mistake;
      3. better exit the program as quickly and cleanly as possible because something bad, like ENOMEM (insufficient kernel memory) just happened and things are going bad quickly. Depending upon the circumstances, a fast, hard crash might be the best thing to do;
      4. and finally, the small category of errors that a program might be able to handle, like ENOENT (file doesn’t exist) depending upon the context (it could then create the file, or ask the user for a different file, etc.).
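As a sketch, those four buckets can be written down as a lookup over the errno values named in the list. The enum and function names are invented for illustration, and a real program would classify many more codes (and often by context, not by number alone).

```c
#include <assert.h>   /* only for the usage check mentioned below */
#include <errno.h>

typedef enum {
    ERR_BUG,          /* #1: programming error, fix the code */
    ERR_EXTERNAL,     /* #2: fixable, but outside the program */
    ERR_FATAL,        /* #3: exit quickly and cleanly */
    ERR_HANDLEABLE,   /* #4: the program may be able to cope */
    ERR_UNKNOWN       /* not covered by this sketch */
} err_class;

err_class classify_errno(int e)
{
    switch (e) {
    case EBADF:
    case EINVAL:
        return ERR_BUG;
    case EACCES:
    case ELOOP:
        return ERR_EXTERNAL;
    case ENOMEM:
        return ERR_FATAL;
    case ENOENT:
        return ERR_HANDLEABLE;
    default:
        return ERR_UNKNOWN;
    }
}
```

For example, assert(classify_errno(ENOENT) == ERR_HANDLEABLE) holds, and a caller of open() could use that to decide whether to create the file or ask for another name.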

      I can see we might have a similar list (your domain errors are my #4; your stop errors are my #2). Also, I was taught in college that if I have no idea how to handle an error, either don’t check for it, or abort immediately.

      1. 1

        I agree that “#3” in your list probably requires some special attention. The existence of the OOM killer (OOMK) means we have to (for practical reasons) treat them like abort errors, dynamically unwinding all the way to _exit() if necessary, and so what I usually do is just raise(SIGABRT). In my mind, though, these are stop errors that are just hard to handle: after all, memory pressure might go away, so waiting (perhaps for a long time) may make the problem go away on its own. So I want to try and explain exactly why I take this approach:

        First, if the data is high-value, say because I am receiving it directly from the user (so if I crash here I lose record of the event too), I will pre-allocate some space to use for notifications, and bind SIGABRT to a signal handler (on an alternate stack) that displays the notification and waits for the problem to go away.

        Secondly, because the experience degrades naturally into abort-error behaviour if the caller cannot (or won’t) do anything about it.

        So I think I can be convinced #3 is special even if I’m not sure it’s a separate “kind” of error; students writing libraries can be advised to start with raise(SIGABRT), and this can be easily hooked in the situations where another programmer (the library consumer) is in a position to deal with it.
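A minimal sketch of the hook, assuming a POSIX system. The library side only ever calls raise(SIGABRT); with the default disposition that terminates the process, but once a handler is installed raise() returns after the handler does, so the library can retry. The names hook_sigabrt and on_sigabrt are made up, and this handler only notifies where a real one would also wait for memory pressure to ease.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Pre-allocated up front, so the handler needs no new memory to run. */
static char alt_stack[64 * 1024];
static volatile sig_atomic_t notified;

static void on_sigabrt(int sig)
{
    static const char msg[] = "out of memory: free something up, will retry\n";
    (void)sig;
    /* write() is async-signal-safe; stdio is not. */
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
    notified = notified + 1;
}

/* Application side: route SIGABRT to the handler on an alternate stack.
 * Returns 0 on success, -1 on failure. */
int hook_sigabrt(void)
{
    stack_t ss;
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigabrt;
    sa.sa_flags = SA_ONSTACK;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGABRT, &sa, NULL);
}
```

A library allocation wrapper would then loop on malloc() and call raise(SIGABRT) each time it gets NULL: unhooked, that is an abort; hooked, it is a stop.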

        What you’ve classed as #1 I don’t think of as an “error” because it has nothing to do with the environment or the inputs; I think conflating the mistakes the programmer made with the mistakes the user made creates a kind of animosity between programmer and user that should be avoided. My mistakes are mine, and not errors that the program should have to deal with.

        That being said, I think it’s perfectly fine to have ASSERTs in there to help the programmer validate their assumptions.

        Notably missing from your list are domain errors (inputs from the user). What do you think about them?

        Also, I was taught in college that if I have no idea how to handle an error, either don’t check for it, or abort immediately.

        This sounds weird to me, because there’s got to be something that helps you choose whether you need to check for it or abort immediately, almost like the design and development has hit its own kind of stop-error, so you need to figure it out. I suggest students abort if they don’t know – there’s some risk of getting this wrong, but it’s usually easier to upgrade an abort to a stop than to a sentinel/domain error (which requires letting your library consumer know they need to do something new).

        Am I missing something that you meant with this?

        1. 1

          #1 are technically errors (hence the error return) but they’re also bugs in the code and have to be fixed; normally they shouldn’t happen at all. I included this category because they can happen.

          I also use assert() often in my code.

          Notably missing from your list are domain errors (inputs from the user). What do you think about them?

          I don’t consider those errors, at least, not the type of errors I was talking about. It’s input from outside, and needs to be checked and verified. Personally, I don’t think user input should be considered as an exception—I tend to think of exceptions like CPU exceptions and dislike in general “software exceptions” (like in C++ or Java) but I do realize they are handy in some very limited cases (my opinion).

          Also, I was taught in college that if I have no idea how to handle an error, either don’t check for it, or abort immediately.

          This sounds weird to me, because there’s got to be something that helps you choose whether you need to check for it or abort immediately, almost like the design and development has hit its own kind of stop-error, so you need to figure it out.

          Back when I was in college, I was attempting to write a BBS game for a friend. I got to the point where I had detected some form of error and wanted to log it. I couldn’t just write an error to the screen since the operator might not see it, so let’s log the error to a file. But what if I couldn’t write to the file? Then what? Okay, how about writing it to the printer (this was back when printers were directly connected to PCs by a parallel port). But then the printer could be off-line or out of paper or something … then what? I can’t just crash the PC because that would take the BBS down. But I had to log the error somewhere, right?

          That was my dilemma, and I went to talk to one of my instructors (an IBM veteran of 30 years, worked on the early FORTRAN compilers) and he told me that advice. I was shocked as you are, but over the years, I’ve come to the conclusion that he was right. There are some cases that have no good way of being dealt with, so why check if you don’t know how to handle it?

          1. 1

            I don’t think user input should be considered as an exception—I tend to think of exceptions like CPU exceptions and dislike in general “software exceptions” (like in C++ or Java) but I do realize they are handy in some very limited cases (my opinion).

            I think exit(1) is the same as throw and vfork() is the same as try, in a way, but there are other ways of thinking about these things too. My point is that I think there’s a good reason to treat the mean of a null set to be all sorts of things, so it’s useful to communicate this as a value returned instead of as an exceptional condition of any kind.
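One way to picture the analogy, as a sketch: run the risky part in a child process and read its exit status in the parent. This uses fork() rather than vfork(), because a vfork() child may only call _exit() or an exec function and so cannot run an arbitrary body. The names try_in_child, throws and succeeds are invented for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* "try": run body in a child. Returns the child's exit status (0 means
 * nothing was "thrown"), or -1 if the child could not be run or was
 * killed by a signal. */
int try_in_child(void (*body)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        body();
        _exit(0);           /* body finished without "throwing" */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* "throw": leave the child with a non-zero status. _exit() is used so the
 * child does not flush stdio buffers it inherited from the parent. */
void throws(void)
{
    _exit(1);
}

void succeeds(void)
{
}
```

Here try_in_child(throws) yields 1 and try_in_child(succeeds) yields 0; the parent's state is untouched either way, which is the "unwinding" part of the comparison.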

            There are some cases that have no good way of being dealt with, so why check if you don’t know how to handle it?

            I enjoyed that story. That makes sense. Thank you.

    2. 5

      I’m doing (string | fail :domainNotFound) GetUserIp(), and I can confirm it’s annoying to handle. I think most of the time you just want the program to die there, but you need to now do special handling in every function up the call chain to propagate the error. Even if it’s as simple as string ip = GetUserIp()?;, you still need to at least adjust the return type.

      I think it’s one of those things where the upside is the same as the downside:

      • Pro: you can’t avoid thinking about error conditions
      • Con: you can’t avoid thinking about error conditions
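The same trade-off sketched in C, with a hypothetical get_user_ip() that returns either an address or a failure tag. None of these names come from the language in the comment above; the raw parameter stands in for wherever the address really comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { IP_OK, IP_DOMAIN_NOT_FOUND } ip_status;

typedef struct {
    ip_status status;
    const char *ip;     /* valid only when status == IP_OK */
} ip_result;

/* raw is the looked-up value, or NULL when it was not defined. */
ip_result get_user_ip(const char *raw)
{
    ip_result r = { IP_DOMAIN_NOT_FOUND, NULL };
    if (raw != NULL && raw[0] != '\0') {
        r.status = IP_OK;
        r.ip = raw;
    }
    return r;
}

/* Propagating: every function up the chain needs a return type that can
 * carry the failure, even if all it does is pass it along. */
ip_status describe_user(const char *raw, char *out, size_t out_size)
{
    ip_result r = get_user_ip(raw);
    if (r.status != IP_OK)
        return r.status;                /* the `?` step, written by hand */
    snprintf(out, out_size, "user@%s", r.ip);
    return IP_OK;
}

/* Dying: callers keep a plain return type and never see the failure. */
const char *get_user_ip_or_die(const char *raw)
{
    ip_result r = get_user_ip(raw);
    if (r.status != IP_OK)
        abort();
    return r.ip;
}
```

describe_user() is what the propagating style costs at each level; get_user_ip_or_die() is the "just die there" alternative that keeps the signatures above it unchanged.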
      1. 3

        Ages ago, I wrote a deeper-going article.