1. 33
    1. 18

      I found that when programming Rust I tend to use the type system to my advantage, rather than sprinkle runtime asserts everywhere.

      Looking at some examples from your blog post:

      For example, if I’m writing a warehouse stock system, parts of my program might assume properties such as “an item’s barcode is never empty”.

      In that case you can represent your barcode as Option<Barcode> where it can be empty, and use Barcode where it cannot. Then you assert as needed by expecting the barcode. Though you cannot make it into a debug_assert!, and rightly so - reading from None in release mode would result in reading from potentially uninitialized memory, which is undefined behavior.
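
To make that concrete, here is a minimal sketch of the idea. The names `Barcode`, `Item` and `expect_barcode` are made up for illustration and are not from the blog post:

```rust
/// A barcode that is guaranteed to be non-empty.
/// (`Barcode` and `Item` are hypothetical names for this sketch.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Barcode(String);

impl Barcode {
    /// Returns `None` for an empty string, so an empty `Barcode` cannot be built.
    pub fn new(raw: &str) -> Option<Self> {
        if raw.is_empty() {
            None
        } else {
            Some(Self(raw.to_owned()))
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

pub struct Item {
    /// An item that has not been labelled yet carries no barcode.
    pub barcode: Option<Barcode>,
}

impl Item {
    /// Panics in every build profile if the barcode is missing.
    pub fn expect_barcode(&self) -> &Barcode {
        self.barcode
            .as_ref()
            .expect("item must have a barcode at this point")
    }
}
```

Code that receives a `&Barcode` never has to re-check for emptiness; only the places that hold an `Option<Barcode>` need to decide what a missing barcode means.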

      For example, I might have written a function which runs much faster if I assume the input integer is greater than 1.

      For this case even the stdlib has built-in types (which are similar but not exactly fitting this particular example) - std::num::NonZero{U,I}{8,16,32,64,size} - whose new functions return an Option<Self>. Likewise you could create your own type like GreaterThanOne<T> which provides the exact functionality you described.
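
A rough sketch of both options, assuming a plain `u32` input. `GreaterThanOne` here is a non-generic, hypothetical version of the type mentioned above, and `halve` is just a stand-in for the "fast" function:

```rust
use std::num::NonZeroU32;

/// The stdlib type: construction fails for zero.
pub fn to_non_zero(value: u32) -> Option<NonZeroU32> {
    NonZeroU32::new(value)
}

/// Hypothetical wrapper for an integer known to be greater than 1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct GreaterThanOne(u32);

impl GreaterThanOne {
    /// The only way to build the type, so the check runs exactly once.
    pub fn new(value: u32) -> Option<Self> {
        if value > 1 {
            Some(Self(value))
        } else {
            None
        }
    }

    pub fn get(self) -> u32 {
        self.0
    }
}

/// Stand-in for the fast function: it can rely on `n > 1` without re-checking.
pub fn halve(n: GreaterThanOne) -> u32 {
    n.get() / 2
}
```

The caller decides what to do with the `None` case at the edge, and everything downstream takes the validated type.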

      There are also some things that can’t really be checked with a runtime assertion - right now I’m writing a parser for a certain old game scripting language. I have a bunch of functions like Keyword::matches, which take in the keyword substring. A mistake I sometimes make however is that I pass in the parser’s entire input instead of just the keyword substring to the function, therefore Keyword::matches always returns false. This is not something you can reasonably check with a runtime assertion, but you could check against it at compile time, by defining a struct InputFragment(str); with a Deref<Target = str>, which could only be produced by slicing the full input, and the type checker would rightly complain whenever you forget to slice the input before passing it into somewhere.
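
Something along these lines, as a sketch only. It uses a borrowed wrapper `InputFragment<'a>(&'a str)` rather than the unsized `InputFragment(str)`, because that avoids unsafe code; `Input`, `slice` and the two keywords are assumptions made up for the example:

```rust
use std::ops::{Deref, Range};

/// Hypothetical wrapper around the parser's full input.
pub struct Input<'a>(&'a str);

/// A piece of the input. The field is private, so outside this module
/// the only way to get one is `Input::slice`.
pub struct InputFragment<'a>(&'a str);

impl<'a> Input<'a> {
    pub fn new(source: &'a str) -> Self {
        Input(source)
    }

    /// Returns `None` if the range is out of bounds or splits a character.
    pub fn slice(&self, range: Range<usize>) -> Option<InputFragment<'a>> {
        self.0.get(range).map(InputFragment)
    }
}

impl Deref for InputFragment<'_> {
    type Target = str;

    fn deref(&self) -> &str {
        self.0
    }
}

pub enum Keyword {
    If,
    While,
}

impl Keyword {
    /// Takes a fragment, so passing the whole `Input` does not type-check.
    pub fn matches(&self, fragment: &InputFragment<'_>) -> bool {
        let expected = match self {
            Keyword::If => "if",
            Keyword::While => "while",
        };
        let candidate: &str = fragment; // deref coercion to &str
        candidate == expected
    }
}
```

With this shape, `Keyword::While.matches(&input)` is a compile error, while `Keyword::While.matches(&input.slice(0..5).unwrap())` is accepted.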

      And just so you know, I’m not saying runtime assertions, and especially debug_assert!, are completely useless 😄 I have used them before myself. It’s just that I tend to use them much less in the presence of Rust’s strong typing and explicit error handling culture.

      1. 11

        Yes, this is the way. Make illegal states unrepresentable. By leveraging Rust’s powerful sum-types, you can eliminate almost all input checking and assertions and the complexity that goes along with it. It is mind-blowing how many corner cases and error conditions simply cannot exist anymore and how much safer and denser the code is when this is done right.

      2. 1

        Sorry, not following your example:

        A mistake I sometimes make however is that I pass in the parser’s entire input instead of just the keyword substring to the function, therefore Keyword::matches always returns false. This is not something you can reasonably check with a runtime assertion, but you could check against it at compile time, by defining a struct InputFragment(str); with a Deref<Target = str>, which could only be produced by slicing the full input, and the type checker would rightly complain whenever you forget to slice the input before passing it into somewhere.

        Is the full input a String so you use InputFragment to ensure you’re matching a slice?

        Just a thought, why not have a debug_assert that looks for a space? Wouldn’t this let you know that it’s just the keyword instead of the full input?

        I have a Vec in my code that contains u32 IDs which should always be in sorted order, and that I need to be in sorted order to perform a merge. I could simply call sort on both, but that’s a wasteful runtime cost. Instead I have a debug_assert that simply ensures each value is larger than the previous (using the amazing iterators available). This is compiled away in release builds, but ensures the lists are sorted in all my tests.
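
For readers who want to see the shape of that check, here is a minimal sketch. The function names and the merge itself are assumptions, not the commenter's actual code:

```rust
/// True if every ID is strictly larger than the one before it.
fn is_strictly_sorted(ids: &[u32]) -> bool {
    ids.windows(2).all(|pair| pair[0] < pair[1])
}

/// Merges two sorted ID lists into one sorted list.
pub fn merge_sorted(a: &[u32], b: &[u32]) -> Vec<u32> {
    // These checks run only when debug assertions are enabled.
    debug_assert!(is_strictly_sorted(a), "left list is not sorted");
    debug_assert!(is_strictly_sorted(b), "right list is not sorted");

    let mut merged = Vec::with_capacity(a.len() + b.len());
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        if a[i] <= b[j] {
            merged.push(a[i]);
            i += 1;
        } else {
            merged.push(b[j]);
            j += 1;
        }
    }
    // One side is exhausted; append whatever is left of the other.
    merged.extend_from_slice(&a[i..]);
    merged.extend_from_slice(&b[j..]);
    merged
}
```

The check is O(n) per list, which is cheap in tests and costs nothing once debug assertions are turned off.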

        1. 2

          Is the full input a String so you use InputFragment to ensure you’re matching a slice?

          Yeah, that’s the idea.

          Just a thought, why not have a debug_assert that looks for a space? Wouldn’t this let you know that it’s just the keyword instead of the full input?

          That’s a good idea, and definitely shorter than creating a newtype. However I still prefer my approach, because it catches the mistake at compile time, so I need to write fewer tests to cover those bad cases.

          Regarding your example, the way I would do it is something like:

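          /// A list of IDs that is guaranteed to be in ascending order.
          pub struct SortedIds(Vec<u32>);

          // Helper: true if each ID is >= the one before it.
          fn is_sorted(ids: &[u32]) -> bool {
              ids.windows(2).all(|pair| pair[0] <= pair[1])
          }

          impl SortedIds {
              /// Checks the invariant; hands the Vec back on failure.
              pub fn new(maybe_sorted: Vec<u32>) -> Result<Self, Vec<u32>> {
                  // Borrow for the check so the Vec can still be moved afterwards.
                  if is_sorted(&maybe_sorted) { Ok(Self(maybe_sorted)) }
                  else { Err(maybe_sorted) }
              }

              /// Caller promises the Vec is sorted; only checked in debug builds.
              pub unsafe fn new_unchecked(maybe_sorted: Vec<u32>) -> Self {
                  debug_assert!(is_sorted(&maybe_sorted));
                  Self(maybe_sorted)
              }

              /// Establishes the invariant by sorting.
              pub fn sort(mut unsorted: Vec<u32>) -> Self {
                  unsorted.sort_unstable();
                  Self(unsorted)
              }
          }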
          

          So still using a debug_assert in the new_unchecked function to signal loudly that an invariant was broken, but also using a strong type so that we only need to verify the invariant at the edges of the API.

          Hope that clears it up!

    2. 11

      It’s not built into the language, but every C/C++ codebase I’ve worked with professionally has this distinction – I think of it as standard practice.

      At Google it’s DCHECK() vs CHECK()

      e.g. https://chromium.googlesource.com/chromium/src/+/main/base/check.h

      I guess the problem is that every C/C++ codebase has its own build system. There’s no concept of “debug” or “release” in the language, so it has to be added when you add your build system.


      One of my pet peeves with GNU Make is that Makefiles don’t have a good concept of debug vs. release builds (or ASAN / UBSAN) by default. Every open source project has to invent “build variants” for itself, which IMO should be a standard feature.

      You can change the flags that the .o files get generated with, but they’re put in the same place, not cached, and overwrite each other. But again every build system I’ve worked with professionally has a clear separation.

      1. 4

        Also, I learned the hard way that you need something like DCHECK() for SPACE overhead, not just time :-)

        For a long time in Oil, there was a symbol __PRETTY_FUNCTION__ showing up in our size profiles, taking up about 10% of the code space, e.g. more than 100 KB out of ~1.2 MB. (current profile: https://www.oilshell.org/release/0.14.2/pub/metrics.wwz/oils-for-unix/overview.txt )

        I didn’t know why it was there

        Turns out that assert() under GCC expands to code that references __PRETTY_FUNCTION__ so it can print the function name

        And the assert() was in our typed Alloc<T> function for garbage collection.

        Oil has hundreds (thousands?) of fine-grained static types for the AST, and they are used with templates like List<T> and C++ namespaces, so that single assert() took up hundreds of kilobytes of data space in the executable

        The solution was to change it to DCHECK(), so it only expands in debug builds, not release builds

        1. 2

          Ideally, release builds would include such instrumentation, but be keyed simply by instruction-pointer; then, given a crash report (or whatever), you should be able to recover all the original information (a la addr2line dance). This is annoying to get right in practice, though.

          1. 2

            It probably wasn’t clear, but the hundreds of kilobytes was simply constant strings of type names, added by assert(), e.g.

            syntax_asdl::Token
            List<syntax_asdl::Token>
            

            etc.

            So those shouldn’t be in the binary at all. It was just a side effect of using assert() in a templated function that has many instantiations.

            I guess nobody cares about a hundred KB these days, with many native toolchains creating 10MB and 100MB binaries, but I do :)

            1. 1

              Yes, I understand. I’m saying that, in production builds, assert should not be compiled out; it should be included, but it should not do anything but dump the instruction pointer. Then you (the author of the software) should be able to, given a production build and a failing instruction pointer, find out where the failure happened (but you may elide the information requisite to do that from the published binary).

              (Personally, I think it is a great travesty that any binary software is ever shipped without full debugging information, but that is a separate issue.)

              1. 1

                FreeBSD and Solaris kernels have some infrastructure that’s a bit like this for tracing. Things that are effectively printf arguments are captured in a ring buffer and then a userspace program (which can be on another machine doing post-mortem debugging) can provide the format strings that accompany the messages. You could take this a bit further for panics if you had compiler support for ensuring that the values you wanted to print were live somewhere and giving you sufficient debug info to recover them, but typically a crash dump loaded in a debugger is sufficient.

      2. 2

        LLVM has some macros that expand to an assertion in debug builds and an assumption in release builds: in releases you tell the compiler that a particular value or state is impossible to improve optimisation, in debug builds you check that the thing you’re assuming really is true.

        In CHERIoT I wrote our debugging infrastructure with two assert-like things:

        • Invariants are checked in all modes. In debug builds, they pretty-print failure messages using formatted output that can tell you what value was wrong. In release builds they just trap.
        • Assertions behave like invariants in debug builds but do nothing in release builds.

        This roughly maps to the Rust model, with one difference: they’re not keyed off a global debug flag, each subsystem has a separate build system option to enable them.

      3. 1

        The NDEBUG macro that the standard assert() macro checks has been around for ages. It just doesn’t seem to be very well known, judging by these blog posts. I don’t think it was ever the intention that assertions would be left enabled in production code, at least not performance-sensitive code.

        1. 2

          NDEBUG exists, but it doesn’t help that much, because you still need 2 kinds of asserts in your source code

          For a long time in Oil I just used assert(), because I started the codebase from scratch. When performance started to be an issue, it wasn’t enough to disable asserts with NDEBUG, because some checks really do need to be on in release mode.

          You could turn those into if statements, but that warps the structure of the program IMO… I ended up with DCHECK() and CHECK(), which is like 3 lines of code, and works well.

    3. 5

      There’s also the #[track_caller] annotation on functions, which blames assert failures (panics) on the caller of the function.

      This is quite handy when you have functions that by contract make it the caller’s responsibility to pass valid inputs, and you don’t want to hear bug reports that your code crashed on a line that contains assert!(input == valid).
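
A small sketch of how that looks in practice. `first_byte` and `caller_line` are hypothetical helpers written for this example:

```rust
use std::panic::Location;

/// Hypothetical helper; by contract the caller passes a non-empty slice.
#[track_caller]
pub fn first_byte(input: &[u8]) -> u8 {
    // With #[track_caller], a failure here is reported at the call site
    // instead of at this line.
    assert!(!input.is_empty(), "first_byte called with empty input");
    input[0]
}

/// Returns the source line of whoever called this function.
#[track_caller]
pub fn caller_line() -> u32 {
    Location::caller().line()
}
```

`Location::caller()` is the same mechanism the panic message uses, so `caller_line()` shows which line a panic inside a `#[track_caller]` function would be attributed to.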

    4. 2

      D calls those assert (debugging, optimized out in release mode) and enforce (checking for domain errors, never optimized out).

    5. 1

      Go doesn’t have assert, just if cond { panic("") }. There have been proposals to add something like if cond { unsafe.Unreachable() } or special case if cond { panic("unreachable") }, so that the compiler can treat a condition as UB and optimize it away, but so far none of them have gone through because no one is confident enough to say “Oh, this check doesn’t need to run in production. We know it won’t ever happen even though the compiler can’t prove it.” I can see why you want debug_assert for code where performance is crucial, but at the same time, I can see why it’s probably good not to tempt fate. :-)

      1. 1

        Note that Rust’s debug_assert! doesn’t do if cond { unsafe.Unreachable() }, ever. In debug mode it panics, in release mode it is just a no-op. If the condition is false you don’t get undefined behaviour, you just execute whatever code would have been executed if the assert statement weren’t there.

        Rust does have std::hint::unreachable_unchecked, an unsafe function that causes undefined behaviour if it is executed. But none of the std assert macros use it.
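
To illustrate the difference, here is a sketch under stated assumptions: `describe` shows the plain debug_assert! behaviour, and `assume` is a hypothetical hand-written helper (not something std provides under that name) showing what an "assert in debug, assume in release" function would have to look like:

```rust
/// debug_assert! only: with debug assertions on, a negative `n` panics;
/// with them off, the check vanishes and the code below simply runs.
pub fn describe(n: i32) -> &'static str {
    debug_assert!(n >= 0, "n must not be negative");
    if n >= 0 {
        "non-negative"
    } else {
        "negative"
    }
}

/// Hypothetical "assume" helper built on `unreachable_unchecked`.
///
/// # Safety
/// The caller must guarantee that `condition` is true. If it is false and
/// debug assertions are disabled, behaviour is undefined.
pub unsafe fn assume(condition: bool) {
    debug_assert!(condition, "assumption violated");
    if !condition {
        // SAFETY: the caller promised that `condition` holds.
        unsafe { std::hint::unreachable_unchecked() }
    }
}
```

Note that `assume` has to be an `unsafe fn`: unlike debug_assert!, getting the condition wrong is undefined behaviour rather than just a skipped check.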

    6. 1

      What it took me years to realise is that I have very different confidence levels about my assumptions in each category. I have high confidence that violations of my assumptions in the second category will be caught during normal testing. However, I have much lower confidence that violations of my assumptions in the first category will be caught during testing.

      This is very well put.