I’ve encountered a few variants of this. FreeBSD used to crash Gem5 because the superpage promotion algorithm doesn’t invalidate pages when installing a promoted mapping. On Intel CPUs, the new mapping would invalidate the old. On AMD, I believe it’s the same. On Centaur, it would detect the conflict and redo the page-table walk, so on all real CPUs it did the right thing and avoided an expensive operation (it doesn’t matter if a core sees the old TLB entry, it still points to the right physical page). We upstreamed a Gem5 fix.
As part of the REMS project, we booted FreeBSD on a formal model of MIPS. The first context switch relied on UB because the hi and lo registers[1] on MIPS are undefined (and, therefore, so are the results of the move-from-hi and move-from-lo instructions) until after the first multiply. Reading these left an unspecified value in the register save frame. The first context switch happened before the first multiply. I think we fixed this by doing a 0*0 multiply early on in the boot.
[1] The MIPS multiply instruction stores its result in two special registers. Before they added interlocks, you had to read them in the right order and with the right delay, but it’s quite a nice property on later architectures because you can get the low half of the result as soon as it’s computed and, if there’s another multiply in the instruction stream, the core can skip computing the top half (which, in the worst case, is a lot more expensive, though in the common case is almost always 0). More modern architectures have explicitly truncating multiply variants.
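To make the hi/lo problem concrete, here is a toy Python sketch. It is not the REMS model or anything from FreeBSD: `HiLoModel`, `save_context`, and the use of `None` to stand for "architecturally undefined" are all made up for illustration. It only shows why saving context before the first multiply trips a strict model, and why a dummy 0*0 multiply makes the problem go away.

```python
class HiLoModel:
    """Toy stand-in for the MIPS HI/LO register pair (hypothetical, for illustration)."""

    def __init__(self):
        # None models "undefined until the first multiply".
        self.hi = None
        self.lo = None

    def mult(self, a, b):
        # Signed 32x32 -> 64-bit multiply; the product is split across HI and LO.
        product = (a * b) & 0xFFFFFFFFFFFFFFFF
        self.hi = (product >> 32) & 0xFFFFFFFF
        self.lo = product & 0xFFFFFFFF

    def mfhi(self):
        # A strict model refuses to hand back an undefined value.
        if self.hi is None:
            raise RuntimeError("mfhi before first multiply: value is undefined")
        return self.hi

    def mflo(self):
        if self.lo is None:
            raise RuntimeError("mflo before first multiply: value is undefined")
        return self.lo


def save_context(cpu):
    # A context switch has to read HI and LO to save them in the register frame.
    return {"hi": cpu.mfhi(), "lo": cpu.mflo()}


if __name__ == "__main__":
    cpu = HiLoModel()
    try:
        save_context(cpu)
    except RuntimeError as err:
        print("strict model complains:", err)

    cpu.mult(0, 0)  # the early-boot workaround: give HI/LO a defined value
    print(save_context(cpu))
```

Real hardware just hands back whatever bits happen to be in the registers, which is why nobody noticed until the code ran on a model that tracks definedness.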
“The behavior has been 100% consistent, but we might want to change it in the future” is the one I would bet on at that time. That is, most instructions do nothing at all with the condition bits, but they saw no need to define the state of the condition bits at any time other than right after a test instruction that sets them, and no reason to foreclose the possibility of a later implementation where some instructions do trash those bits in the name of efficiency.
If you document that all instructions preserve the condition flags, then you’re not going to be able to take that back. If you don’t… well it depends on how many people end up relying on your undocumented behavior and how important they are. Eventually, it doesn’t matter if you wrote it down or not :)