1. 9
  1.  

  2. 4

    Hm, pedagogically, I’d say one nice solution is “don’t explain raw pointers”: in Rust they are a very niche thing. By the time it’s reasonable to cover the details of unsafe, it should be clear what a reference is.

    My preferred explanation for references is to:

    • say that they are pointers/addresses (a stack/heap picture here is mandatory to explain what a pointer is to folks unfamiliar with C)
    • motivate them by a print_vec function which would be inconvenient if it moved its argument
    • hand-wave that the compiler checks that, every time you use a reference, it isn’t stale
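
To make the print_vec motivation concrete, here is a minimal sketch. The names `print_vec_by_value` and `demo` are made up for illustration; only `print_vec` comes from the bullet above. The by-value version has to hand the vector back so the caller can keep using it, while the borrowing version just takes a reference.

```rust
/// Takes ownership of the vector. The caller loses it unless we return it.
fn print_vec_by_value(v: Vec<i32>) -> Vec<i32> {
    println!("{:?}", v);
    v // hand ownership back, which is the inconvenient part
}

/// Borrows the vector. The caller keeps ownership the whole time.
fn print_vec(v: &[i32]) {
    println!("{:?}", v);
}

/// Hypothetical driver showing both styles side by side.
fn demo() -> usize {
    let v = vec![1, 2, 3];

    // By value: we must rebind `v` to the returned vector to keep using it.
    let v = print_vec_by_value(v);

    // By reference: `v` stays usable, so we can call this as often as we like.
    print_vec(&v);
    print_vec(&v);

    v.len()
}
```

Calling `demo()` prints the vector three times; dropping the `let v =` rebinding around `print_vec_by_value` would make the later `print_vec(&v)` calls fail to compile with a use-after-move error.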

    I feel that focusing on reference/pointer differences can sidetrack students. In other languages, references have a “non-first-class” connotation (e.g., Java handles objects through references, but what a reference is isn’t expressible in Java. In C++ there is syntax for references, but they are not first-class (see, e.g., std::reference_wrapper)). In Rust, “a reference is just an ordinary value, just like an integer” is an important part of the mental model.
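
As a rough illustration of “a reference is just an ordinary value”, the sketch below stores references in a struct field and in a Vec and returns them from functions, the same way you would with integers. `Pair`, `larger`, and `even_refs` are hypothetical names, not something from the thread.

```rust
/// A struct whose fields are references, stored like any other value.
struct Pair<'a> {
    left: &'a i32,
    right: &'a i32,
}

/// Returns one of the stored references (not a copy of the integer).
fn larger<'a>(p: Pair<'a>) -> &'a i32 {
    if p.left >= p.right {
        p.left
    } else {
        p.right
    }
}

/// Collects references to the even elements into a plain Vec.
fn even_refs(values: &[i32]) -> Vec<&i32> {
    values.iter().filter(|v| **v % 2 == 0).collect()
}
```

Nothing here needs a wrapper type along the lines of std::reference_wrapper: `Vec<&i32>` and a struct with `&i32` fields are ordinary types.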

    1. 2

      So I feel like this direction makes sense if the student is coming from the world of automatic memory management, where glossing over the details until later might make sense. I’m less sure about those coming from C/asm/etc., who know more about pointers in other languages and are likely to make assumptions about the behavior of low-level pointers.

      1. 2

        I have taught Rust to quite a few people professionally and I have to say that I believed that for a while, but pretty much walked back from it. Rust is Ownership and Borrowing. Raw pointers fit that model quite well, as owned values that represent a very specific case (they point to memory).

        However, I do not teach references as pointers or their representation at all - I teach them as references, very much with the ruleset that the compiler enforces early on. I do not teach what the compiler checks, but what the language guarantees - lifetime tracking follows kind of naturally from that later.

        I believe it’s a didactic mistake to open up the field for learning interference issues by giving too much thought to how people would map things. The key is to consciously avoid mapping, but to answer questions whenever they arise.

        1. 1

          This seems like an easy case, as “a reference is a pointer, but the compiler always enforces that the pointed-to object is valid at the point of dereference” is pretty much all you need here? Bringing attention to the difference (rather than the similarity) between references and raw pointers for this category of students risks two fallacies:

          • thinking you somehow need raw pointers in usual Rust programming (you don’t)
          • thinking that references are somehow magical entities à la C++ (they aren’t)
          1. 3

            the pointed to object is valid at the point of dereference

            Small nitpick, but the pointed to object must be valid at all times, even when not dereferenced:

            https://doc.rust-lang.org/std/primitive.reference.html

            creating a &bool that points to an allocation containing the value 3 causes undefined behaviour

            UCG notes that references must be “dereferenceable”, meaning the compiler is allowed to deref them at any point (e.g. to facilitate optimizations). This, together with raw pointers having no lifetimes, seems to be the main difference from a raw pointer.

            https://github.com/rust-lang/rfcs/blob/master/text/2582-raw-reference-mir-operator.md#motivation

            In particular, references must be aligned and dereferenceable, even when they are created and never used.


            Tangent time, but this invariant made it impossible to implement offsetof soundly, as it required finding the field address of a stub/uninit object (a generic impl can’t know how to initialize all objects):

            it is also currently not possible to create a raw pointer to a field of an uninitialized struct: again, &mut uninit.field as *mut _ would create an intermediate reference to uninitialized data.

            So they added the special/nightly &raw syntax (&raw const / &raw mut), which produces a raw pointer directly and so doesn’t have to point to valid data. It is used internally to implement core::ptr::addr_of!, which lets you soundly do offsetof things now.
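
For the offsetof tangent, a minimal sketch of the addr_of! approach could look like the following. `Header`, its fields, and `offset_of_len` are invented for the example, and `#[repr(C)]` is an assumption added so that the field layout is predictable.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of;

/// Hypothetical struct; #[repr(C)] fixes the field order and padding.
#[repr(C)]
struct Header {
    tag: u8,
    len: u32,
}

/// Computes the byte offset of `len` without ever initializing a `Header`.
fn offset_of_len() -> usize {
    // Storage for a Header that is allocated but never initialized.
    let uninit = MaybeUninit::<Header>::uninit();
    let base: *const Header = uninit.as_ptr();

    // SAFETY: `base` points into the live `uninit` storage. addr_of! only
    // computes the field address; it neither reads the memory nor creates
    // an intermediate reference to uninitialized data.
    let field: *const u32 = unsafe { addr_of!((*base).len) };

    field as usize - base as usize
}
```

Writing `&(*base).len as *const u32` instead would go through exactly the intermediate reference to uninitialized data that the quoted RFC text warns about.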

            The &raw syntax itself is nightly and unstable though. So when writing concurrent data structures that have to use unsafe internally, one could hit the footgun of accidentally creating a reference for a split second, even though the thread could be preempted and the underlying object deallocated in the meantime. This is technically a form of UB that exists in Rust but not in C (among other reference issues, like transitive &mut aliasing of self-referential types such as Future).

            1. 1

              Heh, I was trying to pick at a different nit with this wording :) With NLL, a reference actually can outlive the referenced object, as the borrow checker specifically looks for usages of the reference: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=379dc7af5806d1d717f0a931f17d9917.

              But I am actually not sure how to square the two observations! In https://github.com/rust-lang/rust/pull/98017/files#r904256792 Ralf mentions a live reference, so it seems like liveness should play a role here…
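
The NLL observation can be sketched roughly like this. This is not the contents of the linked playground, just a small illustration with a made-up function name: the variable `r` stays in scope after `x` is gone, and the borrow checker accepts it because `r` is never used after that point.

```rust
/// Hypothetical example: `r` is declared in the outer scope and so
/// lexically outlives `x`, yet the code is accepted by the borrow checker.
fn read_before_owner_dies() -> i32 {
    let r; // reference variable lives until the end of the function
    let copied;
    {
        let x = 5;
        r = &x;
        copied = *r; // last use of `r`
    } // `x` is dropped here, while `r` is still in scope but no longer live

    // Uncommenting the next line makes the borrow checker reject the function,
    // because `r` would then be used after `x` is gone:
    // println!("{}", r);
    copied
}
```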

              1. 1

                I think “lifetime of a reference” there may mean “until the reference is dropped” as references are also objects