1. 15
  1. 9

    While I can’t tell you precisely why this happens, I can tell you when. (Assume t.c contains the code in your post.)

    arm-none-eabi-gcc -Os -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 \
       -mfloat-abi=hard -mthumb \
       -fdump-tree-copyprop2 -fdump-tree-isolate-paths \
       -S -o t.s t.c
    

    You should get two files of extra output: t.c.121t.copyprop2 and t.c.122t.isolate-paths. The relevant content is the main function.

    t.c.121t.copyprop2

    main ()
    {
      uint32_t c;
    
      <bb 2> [local count: 1073741824]:
      c_3 = 0 / 0;
      printf ("c: %u\n", c_3);
      __builtin_puts (&"hello world"[0]);
      return 0;
    }
    

    t.c.122t.isolate-paths

    main ()
    {
      uint32_t c;
    
      <bb 2> [local count: 1073741824]:
      __builtin_trap ();
    }
    

    So this is happening in the isolate-paths pass. The comment at the start of this pass provides some relevant information.

    /* Search the function for statements which, if executed, would cause
       the program to fault such as a dereference of a NULL pointer.
    
       Such a program can't be valid if such a statement was to execute
       according to ISO standards.
    
       We detect explicit NULL pointer dereferences as well as those implied
       by a PHI argument having a NULL value which unconditionally flows into
       a dereference in the same block as the PHI.
    
       In the former case we replace the offending statement with an
       unconditional trap and eliminate the outgoing edges from the statement's
       basic block.  This may expose secondary optimization opportunities.
    
       In the latter case, we isolate the path(s) with the NULL PHI
       feeding the dereference.  We can then replace the offending statement
       and eliminate the outgoing edges in the duplicate.  Again, this may
       expose secondary optimization opportunities.
    
       A warning for both cases may be advisable as well.
    
       Other statically detectable violations of the ISO standard could be
       handled in a similar way, such as out-of-bounds array indexing.  */
    

    The comment acknowledges that a warning may be advisable, but none is emitted, for reasons that are unclear.

    1. 8

      I believe gcc for x86_64 does a similar thing when it detects a null pointer dereference. Ah the wonders of undefined behavior I guess.

      1. 4

        Wouldn’t you get a warning? If the compiler was able to prove a zero-divide, it ought to spit out a warning at the same time.

        1. 15

          Not usually, for two reasons.

          First, the compiler is not a monolith. Warnings are generated in the front end. This knowledge is typically available only after you’ve done a load of optimisations. At that point, you probably don’t have sufficient source information left to be able to provide a useful warning. It may be detectable division by zero only after a load of inlining, constant propagation, arithmetic reassociation, and so on. The division may be in one function, the place it ends up in another, and the information required to prove that the divisor is zero in yet another.

          Second, the optimisation that generates this may generate the invalid instruction in multiple steps. The division by zero is likely to be replaced by a trap in a single step, but by that point the optimiser doesn’t know whether the division was present in the source (it may have been introduced by another transform) or whether it will end up in the output (it may be dead code that will eventually be eliminated). The last of these is most relevant because compilers quite often make use of this kind of thing to find dead code. In theory at least, a valid C program may not contain undefined behaviour. If something would cause undefined behaviour, then it should not be possible and so can be eliminated. It’s therefore fairly common to see things that might be undefined behaviour in the middle of an optimisation pipeline, but most of them don’t end up in the final output.
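
          A classic illustration of this dead-code reasoning is a null check placed after a dereference. A minimal sketch (the function name `first_value` is made up for illustration): because `*p` would be undefined behaviour for a null `p`, an optimiser is permitted to assume `p` is non-null afterwards and treat the check as dead. Whether a given compiler actually removes it depends on version and flags.

          ```c
          #include <stddef.h>

          /* Hypothetical example, not taken from the thread. */
          int first_value(const int *p)
          {
            int v = *p;      /* UB if p is NULL, so the compiler may assume p != NULL */
            if (p == NULL)   /* ...which makes this check a candidate for removal */
            {
              return -1;
            }
            return v;
          }
          ```

          For any valid (non-null) pointer the function simply returns the pointed-to value, with or without the check.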

          1. 5

            I do get warnings of this nature in Xcode — that rely on a lot of code-flow analysis — but they come from the Clang static analyzer, not the compiler itself. Xcode makes it easy to run both in parallel during a build, so I kind of forget I’m not just running “plain” Clang.

            1. 2

              Excellent answer, but at that point it’s a user interface problem. The compiler totally knows the divide-by-zero happens, but doesn’t quite have the information to explain exactly how it got to that point. Still, it seems totally possible for a compiler to say “Hey, pro tip, this particular bit of the output code will Do Something Impossible on some inputs, you might want to check that out”, even if it generates the same actual code.

              1. 8

                That addresses the first problem but not the second. Consider this (massively simplified) example:

                int x(int b, int c)
                {
                  return b / c;
                }
                
                int y(int b)
                {
                  if (b)
                  {
                    return 0;
                  }
                  return b;
                }
                
                int z(int a)
                {
                  int b = 0;
                  if (y(a))
                  {
                    return x(a, b);
                  }
                  return 1;
                }
                

                First thing you do is inline x, so you end up with:

                int y(int b)
                {
                  if (b)
                  {
                    return 0;
                  }
                  return b;
                }
                
                int z(int a)
                {
                  int b = 0;
                  if (y(a))
                  {
                    return a / b;
                  }
                  return 1;
                }
                

                Now you do some constant propagation:

                int y(int b)
                {
                  if (b)
                  {
                    return 0;
                  }
                  return b;
                }
                
                int z(int a)
                {
                  if (y(a))
                  {
                    return a / 0;
                  }
                  return 1;
                }
                

                Now you have a division by zero. Do you raise a warning? Let’s see what happens if you don’t. First, you transform it into a trap, because it’s definitely UB:

                int y(int b)
                {
                  if (b)
                  {
                    return 0;
                  }
                  return b;
                }
                
                int z(int a)
                {
                  if (y(a))
                  {
                    __trap();
                  }
                  return 1;
                }
                

                Now you inline y:

                int z(int a)
                {
                  if (a ? 0 : a)
                  {
                    __trap();
                  }
                  return 1;
                }
                

                Now you run constant propagation again:

                int z(int a)
                {
                  if (0)
                  {
                    __trap();
                  }
                  return 1;
                }
                

                Now you simplify the CFG, and you get:

                int z(int a)
                {
                  return 1;
                }
                

                At the end of your optimisation, multiple steps after you thought you’d found a division by zero, you discover that it’s not there. In theory, you could keep around the information about the reason the trap was introduced in the first place and then propagate that up to the user if a trap survives optimisation, but that has two problems:

                • Just because it’s in the code, doesn’t mean that it’s actually reachable, it just means that the compiler can’t prove it’s unreachable. You’ll get a bunch of false positives. Imagine the above example where every function is from a separate compilation unit. Now you’d get the warning in a normal build but not with LTO. Not a great user experience.
                • The amount of extra information that you’d need to carry through the optimisation pipeline is huge. Conservatively, this would double the size of LLVM IR. Would you be happy if clang memory usage doubled?

                So, yes, it’s ‘just a UI problem’ but in the same way that my shell not recognising natural language instructions is ‘just a UI problem’.

                If you think this is a contrived example, I suggest that you compile some non-trivial C++ code and use clang’s -mllvm -print-after-all flags to see the IR after each optimisation step. You’ll see things like this in intermediate steps from C++ template specialisations all of the time. They’re a bit less frequent with modern C++ where constexpr if statements can trim some of the paths early, but they’re still pretty common. Turning them into false-positive warnings would be a terrible UI.
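
                As a sanity check on the walk-through above, here is the original three-function version plus a small helper, assuming `y` returns 0 on its `if (b)` branch (which is what the inlined form `a ? 0 : a` implies). `z_is_always_one` is a hypothetical name added for illustration. Because `y` always evaluates to 0, `x` is never called and `z` returns 1 for every input, matching the final simplified form.

                ```c
                int x(int b, int c)
                {
                  return b / c;
                }

                int y(int b)
                {
                  if (b)
                  {
                    return 0;   /* assumed: matches the inlined form a ? 0 : a */
                  }
                  return b;     /* b is 0 on this path */
                }

                int z(int a)
                {
                  int b = 0;
                  if (y(a))     /* never true, so the division below is unreachable */
                  {
                    return x(a, b);
                  }
                  return 1;
                }

                /* Hypothetical helper: returns 1 if z(a) == 1 for every a in [lo, hi].
                   Keep hi below INT_MAX so the loop counter cannot overflow. */
                int z_is_always_one(int lo, int hi)
                {
                  for (int a = lo; a <= hi; a++)
                  {
                    if (z(a) != 1)
                    {
                      return 0;
                    }
                  }
                  return 1;
                }
                ```

                A call such as `z_is_always_one(-1000, 1000)` exercises the path without ever executing the division.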

                1. 2

                  A compiler option to make __trap() an error during code generation (that is, after all the wrong candidates are optimized out) instead of an undefined opcode would be useful: Even if it’s in a code path that is never supposed to be executed, if you’re ending up with it in actual code and the optimizer couldn’t get rid of it, the code probably benefits from some massaging.

                  (And if you want to keep __trap() functional for manual use, have the optimizer use an internal symbol, __generated_trap or whatever, which either leads to an error or is rewritten to __trap in a final pass to then become undefined opcode)

                  1. 1

                    It’s pretty trivial to do this without a compiler flag: just grep your output for ud2, or whatever trap is lowered to on a given target. As with a compiler flag (that can be implemented without a complete redesign of the compiler), it will tell you where the traps are but not why.

                    1. 2

                      “just grep your output for ud2, or whatever trap is lowered to on a given target” isn’t “trivial” for several reasons. “Whatever trap is lowered to” can change in any compiler update without notice (such details aren’t documented for mere mortals), and compiler updates are the most critical time to know about such issues (because the optimizer is trying all-new tricks). People might want to support a larger number of architectures (and then have to check for all those possible traps). And they might even have deliberate, manual uses of ud2-or-whatever-a-trap-compiles-to in some places.

                      It’s what I tried before I ended up with https://review.coreboot.org/c/coreboot/+/14364, and the time spent on that patch was well worth my while (even though I had to dive into gcc, yuck) because “just grep for ud2” was a mess.

                      Having a dedicated symbol for that purpose that can be intercepted would be a huge help to track down issues (even when there’s no description of the crack the optimizer smoked before creating the trap) but I guess exit will do for now.

            2. 5

              For that purpose, I have a compiler patch that changes such situations from __builtin_trap (which is eventually compiled to the undefined opcode) to exit. In the environment where I have to deal with this (coreboot), we don’t have exit, so it becomes a link-time error. These are rather simple to pinpoint with objdump -dS.

              That’s the best I have been able/willing to build so far (without digging into the compiler internals too much) but it helped me a couple of times as in firmware space, those “undefined” opcodes are a non-descript hang, not a segfault or SIGILL that you can intercept with a debugger to see where and why they happen.

              1. 2

                This is an interesting hack. I hadn’t planned on building my own toolchain from scratch but now I’m curious to know what other surprises are left for me in my project. Thanks!

              2. 3

                It will often be impossible to tell at compile time whether that code path will ever be executed – this requires solving the “Halting Problem”.

                1. 2

                  You would think so, right? Ignoring the div/0 problem, it certainly seems that if the compiler decides to emit illegal instructions, a warning (or even an error) might be warranted.

                  But in answer to your question, I was unable to find any command line option that would trigger a warning.

                  1. 3

                    Try running the Clang static analyzer. I’d forgotten that Xcode runs it when I build, and it’s what produces that type of warning.

                    1. 2

                      That’s a solid suggestion and I’ll definitely check it out. What is kind of missing from my original post is that this example came from a 3rd party library for a sensor. The compiler generated code for that did not have any illegal instructions.

                      The illegal instruction appeared only after I came along looking at a crash, thought “hmm, computed divisor, wonder if it’s zero”, and added the “GOTCHA” logic. Only after that did I happen to notice that the processor status register bits were not set as one might expect.

                      Long way of saying that I’m not sure that using the analyzer would have prevented this journey of discovery.
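
                      For a computed divisor like the one described here, one option is to check it before dividing, so that no path reaches the division with a zero. A minimal sketch; `checked_div` is a hypothetical helper, not the “GOTCHA” logic and not anything from the sensor library:

                      ```c
                      #include <stdbool.h>
                      #include <stdint.h>

                      /* Hypothetical helper: stores num / den in *out and returns true,
                         or returns false without dividing (and without touching *out)
                         when den is zero. */
                      bool checked_div(uint32_t num, uint32_t den, uint32_t *out)
                      {
                        if (den == 0)
                        {
                          return false;
                        }
                        *out = num / den;
                        return true;
                      }
                      ```

                      The caller decides what a zero divisor means (skip the sample, report an error, and so on) instead of leaving it to undefined behaviour.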