1. 22

  2. 15

    The author presents code equivalent to char* p = malloc(8); char* q = p + 4; *q; and argues that it’s not useful to null-check q — because if malloc() returns NULL, then q would be 0x4, not 0x0 (thus passing the null check).
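
    A minimal sketch of that claim. Pointer arithmetic on a NULL pointer is undefined behaviour in ISO C, so this is illustrative of what common implementations do, not guaranteed behaviour:

    ```c
    #include <stdio.h>
    #include <stddef.h>

    /* Mirrors the article's example: if malloc() fails, p is NULL and q ends
       up at address 0x4 on typical implementations, so q != NULL holds even
       though q is invalid. */
    int passes_null_check(char *p) {
        char *q = p + 4;
        return q != NULL;
    }

    int main(void) {
        char *p = NULL;   /* simulate a failed malloc(8) */
        if (passes_null_check(p))
            puts("q passes the null check despite being invalid");
        return 0;
    }
    ```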

    However, this is begging the question — q is only able to have an invalid but non-null value after neglecting to null-check p in the first place.

    I don’t find the article convincing. Just because memset() needs to be usable with any mapped address (which might include 0x0) doesn’t mean we should never check for NULL.

    1. 10

      While his example of passing 0x4 might be bogus, the overall message isn’t. He’s not saying “never check for NULL” (his sample function dup_to_upper() does check the result of malloc()), but the immediate reflex of checking the input pointer for NULL won’t really help when there are a large number of invalid, non-NULL addresses that could be passed in.

      The point he made was better made in the book Writing Solid Code. That book changed how I code C, and now, instead of:

      char *foo(char *s)
      {
        if (s == NULL)
          return NULL;
        ...
      }
      

      I now write:

      char *foo(char *s)
      {
        assert(s != NULL);
        ...
      }
      

      In my mind, unless I have a very good reason, passing in NULL to a function is a bug. Checking for NULL and returning an error is quite possibly hiding a real bug. Why are you passing NULL? How did that NULL get there in the first place? Are you not checking for NULL elsewhere?

      1. 2

        Out of interest, why assert() instead of using the ptr and allowing the SEGV signal handler and/or core give you similar info?

        1. 9

          Some reasons off the top of my head:

          • To catch the null pointer as soon as possible. Otherwise, it may be propagated further and cause a segfault much farther away, which is harder to debug. Much more so if a null/corrupt pointer is stored in some data structure instead of just passed to callees, since the stack trace is no longer useful in that case.
          • To document the expectations of the function to future developers.
          • You can redefine ASSERT() in debug builds to fail unit tests if they are running, which is more dev-friendly than crashing your test system.
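
          The last point can be sketched roughly as follows; ASSERT, test_fail_hook, and failures_caught are made-up names for illustration, not from any particular codebase:

          ```c
          #include <stdio.h>
          #include <stdlib.h>

          /* Hypothetical project-local ASSERT: in a unit-test build a hook
             records the failure so the test fails cleanly; otherwise it
             aborts like the standard assert(). */
          void (*test_fail_hook)(const char *msg) = NULL;
          int failures_caught = 0;

          #define ASSERT(cond)                                               \
              do {                                                           \
                  if (!(cond)) {                                             \
                      if (test_fail_hook)                                    \
                          test_fail_hook("ASSERT failed: " #cond);           \
                      else {                                                 \
                          fprintf(stderr, "ASSERT failed: %s (%s:%d)\n",     \
                                  #cond, __FILE__, __LINE__);                \
                          abort();                                           \
                      }                                                      \
                  }                                                          \
              } while (0)

          void record_failure(const char *msg) {
              failures_caught++;
              printf("test harness caught: %s\n", msg);
          }

          int main(void) {
              test_fail_hook = record_failure;   /* pretend this is a test build */
              char *s = NULL;
              ASSERT(s != NULL);                 /* reported, not a crash */
              printf("failures caught: %d\n", failures_caught);
              return 0;
          }
          ```

          In a release build the hook is never installed and the macro aborts just like the standard assert().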
          1. 1

            The “find the NULL as soon as possible” makes most sense to me. I guess I was thinking that using it (straight away) would provide this, but I agree we may do non-dangerous things (like storing it somewhere) before we deref it.

            Thank you.

            1. 1

              Beside future developers, they’re also useful for documenting expectations for static analysis tools. The Clang static analyzer, for example, takes them into account.
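
              A minimal example of that: the analyzer treats the asserted condition as true on every path that follows, so it won’t report a possible null dereference in strlen():

              ```c
              #include <assert.h>
              #include <string.h>
              #include <stdio.h>

              /* The assert documents the contract and also tells the Clang
                 static analyzer that s cannot be NULL past this point. */
              size_t checked_len(const char *s) {
                  assert(s != NULL);
                  return strlen(s);
              }

              int main(void) {
                  printf("%zu\n", checked_len("hello"));
                  return 0;
              }
              ```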

            2. 1

              Some systems are configured to allow mapping memory at NULL, which would open up a potential NULL ptr deref vulnerability wherein arbitrary data was stuffed at 0x0.

            3. 2

              The use of assert() here goes against sanitary program behaviour. Returning an error (while printing out useful information) is preferable (for example in Koios I used an enum for errors, null is represented as KERR_NULL) because the program can then act on that and save state, even if some of it might be non-useful, saving as much as is possible is just generally good behaviour. It also allows you to do more with that. Maybe you want to recheck assumptions and reload data, or some other kind of error-avoidance/compensation.

              How would you like it, as a user, if Firefox or Libreoffice, or other programs in which people tend to commonly work, just up and failed for (to the user) no observable reason?

              I don’t see how this is good guidance for anything but the simplest of C programs.

              edit: I forgot about NDEBUG.

              But that in effect makes the check useless for anything but (isolated) testing builds.

              Checking for NULL and returning an error is quite possibly hiding a real bug.

              I really don’t see how. Reporting an error and trying to compensate, either by soft-crashing after saving state, or using other / blacklisting data-sources, is not ‘hiding’ a bug. It’s the bare minimum any program should do to adapt to real world cases and scenarios.
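
              A rough sketch of the error-enum style being described; kerr_t and KERR_NULL here are modeled on the comment’s description, not actual Koios code:

              ```c
              #include <ctype.h>
              #include <stdio.h>

              typedef enum { KERR_OK = 0, KERR_NULL } kerr_t;

              /* Report NULL as an ordinary error so the caller can save
                 state, retry, or fall back, instead of aborting. */
              kerr_t upcase_in_place(char *s) {
                  if (s == NULL)
                      return KERR_NULL;
                  for (; *s != '\0'; s++)
                      *s = (char)toupper((unsigned char)*s);
                  return KERR_OK;
              }

              int main(void) {
                  char buf[] = "save my state";
                  if (upcase_in_place(NULL) == KERR_NULL)
                      puts("caught KERR_NULL; saving state instead of crashing");
                  upcase_in_place(buf);
                  puts(buf);
                  return 0;
              }
              ```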

              1. 4

                The use of assert() here goes against sanitary program behaviour. Returning an error (while printing out useful information) is preferable (for example in Koios I used an enum for errors, null is represented as KERR_NULL) because the program can then act on that and save state, even if some of it might be non-useful, saving as much as is possible is just generally good behaviour. It also allows you to do more with that. Maybe you want to recheck assumptions and reload data, or some other kind of error-avoidance/compensation.

                Assertions are intended for catching bugs not recoverable errors. An error that isn’t recoverable or was unexpected has occurred and it isn’t safe to continue program execution.

                How would you like it, as a user, if Firefox or Libreoffice, or other programs in which people tend to commonly work, just up and failed for (to the user) no observable reason?

                Many large systems including Firefox, the Linux kernel, LLVM, and others use a combination of assertions and error recovery.

                Assertion usage is one of the rules in NASA’s The Power of Ten – Rules for Developing Safety Critical Code.

                1. Rule: The assertion density of the code should average to a minimum of two assertions per function. Assertions are used to check for anomalous conditions that should never happen in real-life executions.
                1. 1

                  An error that isn’t recoverable or was unexpected has occurred and it isn’t safe to continue program execution.

                  A null pointer error doesn’t automagically invalidate all of the state that the user put into the program expecting to get it out again.

                  Many large systems including Firefox, the Linux kernel, LLVM, and others use a combination of assertions and error recovery.

                  Of course.

                  Assertion usage is one of the rules in NASA’s The Power of Ten – Rules for Developing Safety Critical Code. […]

                  Right, but that’s not the usage of the above null check, which advocates for never saving state on null, ever.

                2. 2

                  If the documentation says “this function requires a valid pointer”, why would I bother checking for NULL and returning an error? It’s a bug if NULL is passed in. The assert() is there just to make sure. I just checked, and when I pass NULL to memset() it crashes. I’m not going to blame memset() for this, as the semantics of the function don’t make sense if NULL is passed in for its pointer argument.

                3. 2

                  I love using assert. It’s simple and concise. In a project I wrote to integrate with Pushover, I use assertions at the beginning of any exported function that takes pointers as arguments.

                  Sample code:

                  EXPORTED_SYM
                  bool
                  pushover_set_uri(pushover_ctx_t *ctx, const char *uri)
                  {
                  
                  	assert(ctx != NULL);
                  	assert(uri != NULL);
                  	assert(ctx->psh_uri == NULL);
                  
                  	ctx->psh_uri = strdup(uri);
                  	return (ctx->psh_uri != NULL);
                  }
                  

                  Also: I almost always use calloc over malloc, so that I know the allocation is in a known state. This also helps prevent infoleaks for structs, including compiler-introduced padding between fields. Using calloc does provide a little perf hit, but I prefer defensive coding techniques over performance.

                  1. 1

                    The one issue with using calloc() is that, per the C standard, a NULL pointer does not have to be all-zero bits (POSIX, however, does require a NULL pointer to be all zeros). Yes, in source code, a literal 0 in a pointer context is NULL, but internally it’s converted to whatever the system deems a NULL address.

                    I’ve used a custom malloc() that would fill memory with a particular value carefully selected per architecture. For the x86, it would fill the memory with 0xCC. As a pointer, it will probably crash. As a signed integer, it’s a sizable negative number. As an unsigned integer, it’s a large number. And if executed, it’s the INT3 instruction, aka breakpoint. For the Motorola 68000 series, 0xA1 is a good choice, for all the above reasons, plus it can cause a misaligned read access for 16- or 32-bit quantities if used as an address. I forget what value I used for MIPS, but it was chosen, again, for the same reasons.
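
                    That poison-fill idea can be sketched like this; dmalloc() is a made-up wrapper name:

                    ```c
                    #include <stdio.h>
                    #include <stdlib.h>
                    #include <string.h>

                    /* Fill fresh allocations with 0xCC so stray reads of
                       uninitialised memory surface quickly: on x86, 0xCC is
                       INT3 if executed and an implausible value as a pointer
                       or integer. */
                    void *dmalloc(size_t n) {
                        void *p = malloc(n);
                        if (p != NULL)
                            memset(p, 0xCC, n);
                        return p;
                    }

                    int main(void) {
                        unsigned char *buf = dmalloc(4);
                        if (buf == NULL)
                            return 1;
                        printf("%02X %02X %02X %02X\n",
                               buf[0], buf[1], buf[2], buf[3]);
                        free(buf);
                        return 0;
                    }
                    ```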

                4. 6

                  I just want to salute you for using “begging the question” to describe circular reasoning. I don’t often see that original meaning in the wild.

                  1. 2

                    Assuming NULL is 0x0 is wrong. Of course. The article’s entire argument falls down right there.

                    1. 2

                      On what architectures is NULL not 0x0? I can’t seem to think of any off-hand.

                      1. 3

                        Multics has it be -1, AFAIK. And NULL is probably not 0x0 on anything with non-integer pointers either, like segmented x86.

                        1. 1

                          Here’s a good page on exactly that question: http://c-faq.com/null/machexamp.html

                    2. 7

                      The core argument seems to be:

                      The problem with the extra check is that it creates a contract in the API that the function cannot abide by. When it comes to documenting the new function the documentation will likely have a section about error codes, in which the author will say that the function returns NULL and sets errno to EINVAL if the argument is invalid. But, as we have just seen, that is only true for a very small subset of invalid arguments. [..] Not only does the function not fulfill its contract with the caller, it behaves differently for different values of invalid pointers: a NULL pointer causes it to return with an error, a non-canonical/unmapped/inaccessible pointer causes it to abort the process, and a semantically-invalid address may cause it to return a string with unexpected content.

                      Or, paraphrased more concisely, “it can’t detect all invalid input, therefore it shouldn’t check for it at all”.

                      I’m not really convinced by this; isn’t catching only some of the invalid input better than catching none of them? And the API contract issue seems solvable with a few extra words in the documentation.

                      That being said, I do think this post is pretty informative and useful in pointing out the limitations of such checks (and something I wasn’t really aware of as a casual/occasional C programmer).

                      1. 2

                        And the API contract issue seems solvable with a few extra words in the documentation.

                        I guess that would be phrased as “If you pass in NULL, we will return NULL and set errno”?

                        So then we have a null check inside the function to provide that part of the API, and callers must also NULL check the return of the function (since we’ve documented it may return NULL).

                        And we still have to be able to handle (and diagnose) program crashes due to other uses of invalid pointers, so why not just handle crashes due to a NULL ptr in the same way? Then we don’t need to do the check inside the function, and the caller doesn’t either.

                        (Hmmm - and I normally prefer error returns to exceptions. :-) )

                        1. 1

                          It depends a bit on the function, but in general a soft and maskable error is worse than a hard failure. The example is a bit of a corner case, because it’s like strdup, and every time I call strdup I do it via a wrapper that checks for NULL and returns a null pointer if the argument is NULL. The actual API that I want is ‘give me a pointer that has whatever lifetime I want and can be used in the same place as this char*’. In other functions, however, if I am passing NULL, then I have a logic error. If you then return NULL, I won’t be checking for that, because I don’t expect it, and so this handling will silently mask logic errors and may, at a higher level, either propagate null pointers or skip things like permission checks in ACLs. An assert(p != NULL); here is more useful: if I pass NULL, I get a hard failure in debug builds and can go and fix my code.
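
                          The strdup wrapper described might look like this; xstrdup is a hypothetical name:

                          ```c
                          #include <stdlib.h>
                          #include <string.h>
                          #include <stdio.h>

                          /* NULL in, NULL out: propagate instead of crashing,
                             since here the contract is "same lifetime rules as
                             the char* the caller gave us". */
                          char *xstrdup(const char *s) {
                              return (s == NULL) ? NULL : strdup(s);
                          }

                          int main(void) {
                              char *copy = xstrdup("whatever lifetime I want");
                              puts(copy ? copy : "(null)");
                              free(copy);
                              return 0;
                          }
                          ```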

                          1. 1

                            I’m not really convinced by this; isn’t catching only some of the invalid input better than catching none of them?

                            Depends who you expect to call this function; API vs internal helper perhaps have different expectations. If the expectation is well described up-front in the documentation, I don’t think the NULL check is necessary as well. Let the caller worry about this. I don’t think C will ever have a good way of describing contracts (such as no non-null arguments) by design and most people who are writing C are probably trying to lower the number of unnecessary branches that are likely being checked by the caller already.

                          2. 6

                            I get it: Offensive programming is in essence what the author is arguing for.

                            Edit: No joke! It’s simply a case of non-defensive programming. Don’t focus on what it sounds like.

                            1. 4

                              Is this a variant of Klingon programming?

                              Defensive programming? Never! Klingon programs are always offensive. Yes, offensive programming is what we do best.

                              Perhaps it IS a good day to die! I say we ship it!

                              1. 2

                                Sounds like the adage adapted by python which says that it is better to beg forgiveness than to ask permission.

                                1. 1

                                  This adage is a lot older than Python[1]. And although it may be popular in the Python community, I doubt you’ll find many people outside of it who would associate it with Python. E.g. I associate it with reverse engineering proprietary drivers.

                                  [1]: https://quoteinvestigator.com/2018/06/19/forgive/

                                  1. 1

                                    I think they meant “adopted by Python”, i.e. an existing adage taken on by Python.

                              2. 6

                                C’s type system doesn’t have non-nullable pointers/references, so this is what you have to work with.

                                If you’re creating a library, you better check for everything you can. Otherwise the stack trace of a crash will point to you, and users will say your code is crashing.

                                1. 5

                                  The discussion so far assumed that NULL is always an invalid pointer. But is it? The C standard defines a NULL pointer as a pointer with the value 0

                                  The C standard also defines that pointer value to be an invalid value for dereferencing.

                                  The following is footnote 102 from the C11 standard. Being a footnote, it’s not normative, but it’s consistent with the body of the standard; the null pointer can’t be valid, because it doesn’t refer to any object:

                                  Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime.

                                  (It doesn’t matter that the address 0 may be a legitimate address in the underlying architecture; in C, 0 is not a valid dereferencable pointer value).

                                  1. 1

                                    Interesting, how do I read/write the memory at 0x00000000 then? On embedded systems, zero can be the address for the reset vector or maybe some memory-mapped peripheral, but apparently the C standard cannot represent it.

                                    1. 3

                                      The quoted section from the article is actually wrong. C defines that the cast from an integer constant expression that evaluates to 0 to a pointer is NULL. Casting any other integer (including a run-time integer value that happens to be 0) to a pointer is, at best, implementation defined (it’s implementation defined if you do it via intptr_t, it’s unspecified if you do it any other way). There’s no requirement that NULL have a bit pattern of all zeroes and on some architectures the all-ones bit pattern is used because that’s more convenient and allows any address from zero to the top of memory to be used as a valid address.
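
                                      In code, the distinction looks like this; on mainstream platforms both conversions happen to produce the same bits, which is exactly why the guarantee is easy to misremember:

                                      ```c
                                      #include <stdio.h>
                                      #include <stdint.h>

                                      /* Only an integer constant expression 0 is
                                         guaranteed by C to convert to the null
                                         pointer; converting a run-time zero via
                                         intptr_t is implementation-defined. */
                                      int constant_zero_is_null(void) {
                                          char *a = 0;   /* guaranteed null pointer */
                                          return a == (char *)0;
                                      }

                                      int main(void) {
                                          int zero = 0;
                                          char *b = (char *)(intptr_t)zero;  /* implementation-defined */
                                          printf("constant 0 -> null: %d\n", constant_zero_is_null());
                                          printf("run-time 0 -> null here: %d\n", b == (char *)0);
                                          return 0;
                                      }
                                      ```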

                                      1. 1

                                        There’s no requirement that NULL have a bit pattern of all zeroes and on some architectures the all-ones bit pattern is used because that’s more convenient and allows any address from zero to the top of memory to be used as a valid address.

                                        Are any of these architectures currently manufactured? I looked into this a couple of years ago and all of the examples I found were at least 30 years old.

                                        1. 1

                                          Yes, at least one of the IBM mainframe architectures (Z series, I believe?). This came up when porting LLVM to that architecture. Interestingly, on AMD GPUs, address 0 is a valid (and frequently used) stack address. They would do much better by making an int-to-pointer cast subtract one and use the all-ones bit pattern to represent null, so casting (int)0 to a pointer would give an all-ones pointer value.

                                    2. 1

                                      I came here to say that, thanks :)

                                    3. 3

                                      I agree in theory with the notion of pushing work to your caller to keep your function obviously correct under simplified requirements. But in C you can’t express the difference between a pointer argument that may be null and one that may not. The compiler won’t help the caller do the right thing. So either your function is complex to write or complex to use. And in a library, you only have control over one side.

                                      I think I would follow this advice for a private function but do the opposite for a library export.

                                      1. 2

                                        Standard C can’t but most compilers have a _Nonnull or similar.
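
                                        For example, with the GCC/Clang nonnull attribute (a compiler extension, not standard C):

                                        ```c
                                        #include <string.h>
                                        #include <stdio.h>

                                        /* With -Wnonnull the compiler warns when it
                                           can prove a NULL argument at the call
                                           site, typically only for a literal NULL. */
                                        __attribute__((nonnull(1)))
                                        size_t my_len(const char *s) {
                                            return strlen(s);
                                        }

                                        int main(void) {
                                            /* my_len(NULL);  <- -Wnonnull would flag this call */
                                            printf("%zu\n", my_len("hi"));
                                            return 0;
                                        }
                                        ```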

                                        1. 1

                                          I haven’t had much luck with __attribute__((nonnull)). It seems to warn only in the most obvious cases, like literally passing a NULL argument. _Nonnull in ObjC is only slightly less weak, but also very easy to accidentally cast away (IIRC a variable definition just overrides nullability without needing any proof).

                                          I’d love to have some sound mechanism like Option that’s a compile-time guarantee, not merely a lint.

                                      2. 2

                                        This attitude is why everything is broken all the time.