
      It’s really obvious when you dig into AArch64 how data-driven the design was. A lot of the nice things are in there because compilers could make use of them. The value of these ‘CISCy’ instructions is more than the small code-size saving: they also avoid the need to allocate a rename register for the intermediate result. Register rename is one of the most power-hungry parts of a superscalar chip (it can easily become the bottleneck, and you can’t turn it off while executing instructions), so removing a rename register from a sequence in a hot loop can often give a disproportionately large speedup. You could replace this instruction with an add and a conditional move, but then you have one more live rename register, which reduces the size of the speculation window.
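
      To make the dataflow concrete, here is a rough sketch in C. It assumes the instruction under discussion behaves like AArch64's CSINC (result = cond ? a : b + 1), which is my guess rather than something stated above. The function names are made up for illustration, and the assembly in the comments is only what a compiler might plausibly emit; an optimising compiler is free to generate the same code for both functions.

      ```c
      #include <stdint.h>

      /* Fused form: one operation, one result register.
         On AArch64 a compiler may lower this to a single CSINC. */
      uint64_t select_or_inc_fused(uint64_t a, uint64_t b, int cond)
      {
          return cond ? a : b + 1;
      }

      /* Split form: the add writes an intermediate value (tmp) that needs
         its own destination register before the select can consume it. */
      uint64_t select_or_inc_split(uint64_t a, uint64_t b, int cond)
      {
          uint64_t tmp = b + 1;      /* e.g. add  tmp, b, #1           */
          return cond ? a : tmp;     /* e.g. csel result, a, tmp, cond */
      }
      ```

      Both functions compute the same value. The difference is that the split form has two results in flight (tmp and the final value), which is the extra live rename register described above.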