1. 16
  1.  

  2. 6

    I strongly recommend the Clang Undefined Behavior Sanitizer — it adds runtime checks to your code that detect and flag (among many other things) integer overflows. I always enable this in my dev/debug builds.

    Some of the weirder rules only apply to long-obsolete hardware. I think you have to go back to the 1960s to find CPUs with 36- or 18-bit words, or that don’t use 2s-complement arithmetic, and to the 1970s to find ones whose memory isn’t byte-addressable (i.e. “char” is bigger than 1 byte.) And EBCDIC never made it out of IBM mainframes.

    But I guess if you ignore those factors you’ll get outraged bug reports from the retro-computing folks complaining that your code breaks on OS/360, TENEX or TOPS-20…
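
    To make the sanitizer suggestion concrete, here is a minimal sketch in C. `checked_add` is a hypothetical helper (it is not from the article): it refuses additions that would overflow instead of evaluating them. As far as I know, building with `clang -fsanitize=undefined` flags a plain `a + b` that overflows at runtime, whereas this version never evaluates the overflowing expression at all.

    ```c
    #include <assert.h>
    #include <limits.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical helper: stores a + b in *out and returns true only when
       the mathematical sum fits in an int. The range test runs before the
       addition, so signed overflow (undefined behavior) never happens. */
    bool checked_add(int a, int b, int *out)
    {
        assert(out != NULL);
        if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
            return false;   /* a + b would not fit in an int */
        *out = a + b;
        return true;
    }
    ```

    Calling `checked_add(INT_MAX, 1, &r)` returns false and leaves `r` alone; the unchecked `INT_MAX + 1` is the kind of thing the sanitizer is meant to report.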

    1. 6

      Some of the weirder rules only apply to long-obsolete hardware. I think you have to go back to the 1960s to find CPUs with 36- or 18-bit words, or that don’t use 2s-complement arithmetic, and to the 1970s to find ones whose memory isn’t byte-addressable (i.e. “char” is bigger than 1 byte.) And EBCDIC never made it out of IBM mainframes.

      You need to look no further than the SHARC, still one of the most popular DSP lines at the moment, to find an architecture where sizeof(char) = 4.

      (Edit: FWIW, I think a better approach to these would be to make the standard stricter, even if that means more non-compliant implementations for “special” architectures like the SHARC. I’m pretty sure AD’s C compiler isn’t (or wasn’t; thankfully for my sanity I haven’t touched it in about 5 years) standards-compliant anyway, because sizeof(int) is also 4. We’re at a stage where you expect inconsistencies and bugs in vendor-supplied compilers anyway. There’s no need for anyone else to suffer just so a bunch of vendors with really particular requirements can claim compliance, when most of them don’t care about it anyway.)

      1. 5

        You need to look no further than the SHARC, still one of the most popular DSP lines at the moment, to find an architecture where sizeof(char) = 4.

        Indeed. It’s the little, low-level, specialized devices where all the weirdness shows up. Basically, if it’s not a 32-bit device there is probably something that violates your expectations. Don’t assume the world is made up of the computers you’re used to using.

        1. 4

          Ok, mind blown! I did not know that. But I can see how a DSP platform wouldn’t see byte-addressability as necessary.

          1. 1

            Wait, isn’t sizeof(char) = 1 by definition? I suspect that what you meant to say is that for the C implementation that runs on Analog Devices’ SHARC DSP, char is 32 bits wide, int is also 32 bits wide, and actually sizeof(int) = 1.

            1. 1

              You may be right, I don’t have the compiler at hand anymore. I’m sure (it caused me a lot of headaches when porting some code from elsewhere) that sizeof(char) and sizeof(int) were equal but I really don’t remember which of these two puzzling results it yielded.
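
              For anyone who wants to check on their own toolchain, a minimal sketch in C (the function names are made up for illustration). `sizeof` counts in units of `char`, so the number of bits in a `char` comes from `CHAR_BIT` in `<limits.h>`, not from `sizeof(char)`.

              ```c
              #include <assert.h>
              #include <limits.h>
              #include <stddef.h>

              /* Bits in one C "byte" (one char). The standard requires at least 8;
                 it can be larger on word-addressed targets such as some DSPs. */
              size_t bits_per_char(void)
              {
                  return CHAR_BIT;
              }

              /* Storage size of int in bits. sizeof counts chars, not octets, so a
                 32-bit int gives sizeof(int) == 4 when CHAR_BIT is 8 and
                 sizeof(int) == 1 when CHAR_BIT is 32. */
              size_t int_storage_bits(void)
              {
                  assert(sizeof(char) == 1);   /* true by definition */
                  return sizeof(int) * CHAR_BIT;
              }
              ```

              On a desktop compiler this will most likely report 8 and 32; a compiler where `char` and `int` are both 32 bits wide would report 32 and 32 with `sizeof(int) == 1`.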

          2. 3

            And EBCDIC never made it out of IBM mainframes.

            This is true, but there’s still a lot of code running (and being maintained!) on mainframes, so it can’t be ignored.

            1. 1

              Didn’t ICL mainframes also use EBCDIC?

          3. 2

            Incidentally, the _ExtInt proposal, if accepted, fixes promotion rules.


            when converting to a signed type […] if it doesn’t fit, then the behavior is implementation-defined, and could raise an exception (e.g. overflow trap).

            The behaviour is undefined, not implementation-defined, and therefore free to invoke eldritch nasal demons.

            Misconceptions […] Converting a pointer to an int and back to a pointer will be lossless.

            This is guaranteed for (u)intptr_t.

            Misconceptions […] Converting {a pointer to one integer type} to {a pointer to another integer type} is safe. (e.g. int *p = (...); long *q = (long*)p;.) (See type punning and strict aliasing.)

            This is permitted for char pointers.
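
            A minimal sketch of the two pointer cases in C, with hypothetical helper names. It assumes the implementation provides `uintptr_t` at all (the standard makes it optional), and it only round-trips an object pointer, going through `void *` as the guarantee is worded.

            ```c
            #include <assert.h>
            #include <stdint.h>

            /* Round-trips an object pointer through uintptr_t. The guarantee is
               stated for void *, hence the intermediate casts. */
            int *roundtrip(int *p)
            {
                void *before = p;
                uintptr_t bits = (uintptr_t)before;
                void *after = (void *)bits;
                assert(after == before);   /* guaranteed to compare equal */
                return after;
            }

            /* Inspecting an object's bytes through unsigned char * is allowed,
               unlike reading an int through a long *. */
            unsigned char first_byte(const void *obj)
            {
                return *(const unsigned char *)obj;
            }
            ```

            Which byte `first_byte` returns for a multi-byte object depends on the platform's byte order, so it is only predictable for single-char objects.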

            1. 1

              if the target type is signed, the behavior is implementation-defined (which may include raising a signal)

              https://en.cppreference.com/w/c/language/conversion#Integer_conversions

              If the destination type is signed, the value does not change if the source integer can be represented in the destination type. Otherwise the result is {implementation-defined (until C++20)} / {the unique value of the destination type equal to the source value modulo 2^n where n is the number of bits used to represent the destination type. (since C++20)}. (Note that this is different from signed integer arithmetic overflow, which is undefined).

              https://en.cppreference.com/w/cpp/language/implicit_conversion#Integral_conversions

              Speaking meta, I wrote the article because many people don’t understand the rules in C/C++.

              This is guaranteed for (u)intptr_t.

              I’m not sure about this one… There might be problems with fat function pointers?
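
              For the C side of the conversion rule quoted above, a minimal sketch. `to_int_checked` is a hypothetical helper that converts only when the value fits, so the implementation-defined case never comes up.

              ```c
              #include <assert.h>
              #include <limits.h>
              #include <stdbool.h>
              #include <stddef.h>

              /* Hypothetical helper: converts an unsigned int to int only when the
                 value is representable. Out-of-range values are rejected rather than
                 converted, because in C that conversion is implementation-defined
                 (or raises an implementation-defined signal). */
              bool to_int_checked(unsigned int u, int *out)
              {
                  assert(out != NULL);
                  if (u > (unsigned int)INT_MAX)
                      return false;
                  *out = (int)u;   /* value fits, so it is unchanged */
                  return true;
              }
              ```

              Note that this is a range check on a conversion, not on arithmetic; signed arithmetic overflow is the separate, undefined case.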

              1. 2

                implementation-defined

                Ah, right, the clause I was thinking of has a different context (conversion of floats to ints). Though there is UB for signed shifts out of range (C11 §6.5.7p4):

                [for E1 << E2] If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.


                I’m not sure about [(u)intptr_t]… There might be problems with fat function pointers?

                Ahhh, good point. intptr_t can round-trip any void pointer, which can in turn round-trip any object pointer, but a function is not an object.
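
                For the shift rule quoted above, a minimal sketch of a guarded left shift in C. `checked_shl` is a hypothetical helper, and it assumes `int` has no padding bits, so that its width is `sizeof(int) * CHAR_BIT`.

                ```c
                #include <assert.h>
                #include <limits.h>
                #include <stdbool.h>
                #include <stddef.h>

                /* Hypothetical helper: computes e1 << e2 only when the result is
                   defined: e1 is nonnegative, e2 is smaller than the width of int, and
                   e1 * 2^e2 is representable in int. Assumes int has no padding bits. */
                bool checked_shl(int e1, unsigned int e2, int *out)
                {
                    assert(out != NULL);
                    if (e1 < 0)
                        return false;                 /* negative left operand */
                    if (e2 >= sizeof(int) * CHAR_BIT)
                        return false;                 /* shift count too large */
                    if (e1 > (INT_MAX >> e2))
                        return false;                 /* e1 * 2^e2 would not fit */
                    *out = e1 << e2;
                    return true;
                }
                ```

                The test `e1 > (INT_MAX >> e2)` is the representability condition rearranged so that the check itself cannot overflow.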