“Another class of databases query a single physical clock (a “clock oracle” as described in the Google Percolator paper) have an ambiguity window equivalent to the roundtrip internet latency to that shared clock, which is even worse, and suffer from an obvious single point of failure.” (my emphasis added)
Or they can use private lines (example) between clusters that are just far enough apart that most failures in one won’t impact the other. This is common with VMS clusters and the like, and it works out fine. Some use two providers for redundancy. Bypassing the Internet also reduces the amount of BS that hits the network through the line. Predictability and latency may well be better too, but I don’t have current data on that.
It’s always strange to me that many otherwise-good articles on synchronization act as if private lines don’t exist or clusters have to be a country’s worth of distance apart. Any company that can afford private lines between critical sites should consider using them. The tougher tradeoffs of Internet lines then apply only to the cases where a company can’t afford private lines or chooses the Internet for other reasons.