I used to work for a web hosting company that deployed dozens of new servers every week. These were built from mid-tier, consumer-level components, with nothing at all enterprise about them, so the quality control at the manufacturer was very likely not excellent. All of the boxes we provisioned for a given hosting plan had identical hardware: same motherboard, memory, CPUs, disks, and so on.
But I got to see first-hand how theoretically identical hardware could perform differently. We would plug a box into the network, turn it on, and let it PXE boot into the OS installer. The install process took around 20 minutes. I say “around” because we would do two hosts at once. Most of the time the two hosts would finish within seconds of each other, but occasionally one would finish a minute or two ahead of or behind the other.
The details of the setup escape me now because it was more than a decade ago, but I’m sure I somehow ruled out the PXE server and the network as contributors to the variation. I always meant to pursue it further to satisfy my own curiosity, but around that time I ended up leaving the company for a much, much worse one (as I would find out later).
You might find this interesting. There are apparently a lot of causes to consider.
Thank you, I do find that interesting and I will plan to read it over lunch.
This is a great example of what I’ve been talking about in the hardware subversion threads. Variation is one of the negative effects of deep sub-micron design. I mentioned them changing the operating characteristics because, in our security assumptions, we depend on the CPU operating as specified. We can’t now, or we can only in a highly probabilistic way. It’s hard to reason even about probabilities if I don’t know which components will fail. The 350nm+ nodes are still safe if one can take their poor specs.
re 2,386 in LLNL.
I’m glad he mentioned that. The first thing that popped into mind was, “Where the hell did they get 2,000+ Intel processors that were actually physical rather than VMs?” Looks like investments in traditional supercomputing labs are still paying off in new ways. :)
I’ve seen these plots before and I’ve been desperate to share them with my lab mates for quite some time. Thanks for sharing!