1. 2
  1.  

  2. 4

    I have some beef with this article’s treatment of pointers.

    It is well known that in C, all pointers can potentially be NULL, and this value is invalid.

    I’m not sure if dereferencing NULL is canonically invalid in C. It is certainly an ancient convention to use the address 0 to represent “nothing”. However, memory address 0 has not always been some magic singularity of doom.

    dereferencing a NULL pointer is not just ambiently a bad idea, it is something that will crash the program as hard as it can be crashed from the inside.

    Sigh. I keep having to repeat this for the benefit of the youngsters who never coded on old systems. It is true that on “real operating systems” dereferencing NULL produces an instant segfault … because the hardware has an MMU and the kernel deliberately unmaps page 0. That’s largely because of this NULL coding convention — it’s much, much easier to diagnose null-pointer bugs when they cause an immediate segfault!

    It is not true in all environments you’d run C on. It was infamously untrue on the ‘classic’ Mac OS (1984-2001), where there was indeed RAM at location 0. You could read and even write NULL without obvious problems. (But if you wrote too far past NULL you’d clobber a bunch of interrupt vectors and crash hard.) There was an essential dev tool called “EvenBetterBusError” that wrote a bad address, 0xDEADBEEF, to location 0 sixty times a second so that doubly-dereferencing would crash. EBBE also triggered its own segfault if it saw that the value at 0 had been modified.

    I believe there was some kind of system data located at location 0, because the classic Mac OS wasn’t given to wasting memory. I don’t know what it was exactly, but if it existed then the system accessed it by reading or writing through NULL.

    Even now there are a lot of embedded systems without MMUs that run C. Whether NULL is a valid address is up to whoever designed the hardware (that is, whether that range is mapped to RAM or something else readable). I’m pretty sure any sane system designer would make NULL unmapped, for the same debuggability reasons, but that’s a design decision and not some law of nature about NULL.

    when C pointers are used to implement objects, usually the C object system needs a pointer to an object to point to the vtable for the object, so a NULL pointer is invalid in the context of calling a method.

    That depends entirely on how objects are implemented. C++ uses vtables, but only for virtual methods. Nonvirtual methods are just regular function calls, and in practice you can call them with a NULL receiver. I used to have some code that actually took advantage of this (calling methods on NULL returned default results, as in Obj-C), but at some point the Clang compiler and/or sanitizers started complaining about it, and I found out it’s actually undefined behavior under the C++ spec.

    In Objective-C it is perfectly valid to call a method on NULL, and it’s done all the time. The method dispatcher (objc_msgSend) treats it as a special case and returns immediately with a result of 0 / NULL. This is actually really useful when chaining together method calls.
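
    The same chaining style can be approximated in Go, which is the language the article is about. This is only a sketch: `Node`, `Next`, and `Value` are hypothetical names made up for illustration, and the methods have to check for a nil receiver themselves, since Go has no dispatcher doing it for them.

    ```go
    package main

    import "fmt"

    // Node is a hypothetical singly linked list node, used only for illustration.
    type Node struct {
    	value int
    	next  *Node
    }

    // Next returns the following node, or nil when called on a nil receiver.
    func (n *Node) Next() *Node {
    	if n == nil {
    		return nil
    	}
    	return n.next
    }

    // Value returns the stored value, or 0 when called on a nil receiver.
    func (n *Node) Value() int {
    	if n == nil {
    		return 0
    	}
    	return n.value
    }

    func main() {
    	list := &Node{value: 1, next: &Node{value: 2}}

    	// The chain runs off the end of the list, but every call is still valid.
    	fmt.Println(list.Next().Value())
    	fmt.Println(list.Next().Next().Next().Value())
    }
    ```

    The second chain walks past the last node, so every later call has a nil receiver and the result is the zero value rather than a crash.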

    As a result, I disagree with the author’s assertion that Go pointers are unlike C pointers. Go allowing method calls on nil isn’t a real difference; it’s just that Go methods (on non-interface types) are nonvirtual.
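
    Here is a rough sketch of that distinction, with hypothetical names (`Greeter`, `English`, `callPanics`) that are mine and not from the article. A call through a nil pointer of a concrete type is resolved at compile time and goes through; a call through a nil interface has no concrete type to dispatch on, so it panics.

    ```go
    package main

    import "fmt"

    // Greeter and English are hypothetical names used only for this sketch.
    type Greeter interface {
    	Greet() string
    }

    type English struct{}

    // Greet never touches its receiver, so a nil *English is fine.
    func (e *English) Greet() string { return "hello" }

    // callPanics reports whether f panicked.
    func callPanics(f func()) (panicked bool) {
    	defer func() {
    		if recover() != nil {
    			panicked = true
    		}
    	}()
    	f()
    	return false
    }

    func main() {
    	var p *English // nil pointer, concrete type known at compile time
    	var g Greeter  // nil interface, no method table to look the call up in

    	fmt.Println(callPanics(func() { _ = p.Greet() })) // static call, no panic
    	fmt.Println(callPanics(func() { _ = g.Greet() })) // dynamic call, panics

    	g = p // interface now holds a (*English)(nil), so dispatch works again
    	fmt.Println(callPanics(func() { _ = g.Greet() }))
    }
    ```

    The last case is the closest analogue to a C++ virtual call: once the interface knows the concrete type, the nil receiver is passed along like any other argument.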

    Go pointers lack what I consider the distinctive characteristic of “pointers”, the ability to do pointer arithmetic.

    I don’t think that’s a necessary aspect of pointers. This is a very C-centric view! Pointer arithmetic is usually viewed as a horribly dangerous misfeature in most newer programming languages. Historically there have been languages with pointers that don’t have pointer arithmetic in this sense, such as Pascal. (The various extensions of Pascal that made it into a feasible systems programming language did add the ability to mess with pointers, but IIRC it wasn’t quite as simple as adding integers to pointers. I think it involved the ability to type-cast between pointer and integer types.)
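
    Go seems to land in a similar place: ordinary pointers have no arithmetic, but the `unsafe` package lets you round-trip through an integer type. A minimal sketch, where the helper name `second` is my own invention:

    ```go
    package main

    import (
    	"fmt"
    	"unsafe"
    )

    // second returns a pointer to the element that follows *first in memory.
    // It is only valid when *first is not the last element of its array.
    func second(first *int32) *int32 {
    	// Cast pointer -> integer, add one element size, cast back.
    	// Writing first + 1 directly would not compile.
    	return (*int32)(unsafe.Pointer(uintptr(unsafe.Pointer(first)) + unsafe.Sizeof(*first)))
    }

    func main() {
    	nums := [3]int32{10, 20, 30}
    	fmt.Println(*second(&nums[0]))
    }
    ```

    The conversion has to happen in a single expression, and the package name makes it obvious that you have stepped outside the language’s safety guarantees, which is rather the point.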

    1. 1

      I’m not sure if dereferencing NULL is canonically invalid in C. It is certainly an ancient convention to use the address 0 to represent “nothing”. However, memory address 0 has not always been some magic singularity of doom.

      From C17 standard section 6.5.3.2 Address and indirection operators, heading Semantics:

      If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.¹⁰⁴

      Footnote 104:

      Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime.

      The C89 standard contains almost the same verbiage (3.3.3.2 Address and indirection operators, heading Semantics and footnote 34):

      If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.³⁴

      Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, or the address of an object that has automatic storage duration when execution of the block in which the object is declared and of all enclosed blocks has terminated.

      Interestingly, C89 does not call out a pointer to a freed dynamically allocated object as being invalid.

      EDIT: Also, NULL does not necessarily have to be memory address 0.

      1. 1

        Thanks for the clarification. But “undefined behavior” is not the same thing as the OP’s “crashes the program”. It just means the C spec allows any and all behaviors at and after that point.

    2. 2

      The author is complaining that this Go code works:

      package main

      import "fmt"

      type SomeStruct struct{}

      // Print never reads through ss, so a nil receiver is harmless.
      func (ss *SomeStruct) Print() {
      	fmt.Println("hello!")
      }
      func main() {
      	ss := (*SomeStruct)(nil)
      	ss.Print()
      }
      

      How is it different from this C code, which also works?

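      #include <stdio.h>

      /* An empty struct is a GNU extension, not ISO C; GCC and Clang accept it. */
      typedef struct{} SomeStruct;
      void print(SomeStruct *ss) {
        puts("hello!");
      }
      int main(void) {
        SomeStruct *ss = NULL;
        print(ss);
        return 0;
      }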