P.S. apologies for submitting this and then taking out a big chunk of the content afterwards. On reflection I thought the article would be stronger after taking out most of the opinion. The change is viewable here.
This is a useful perspective, but assumes you control the entire system and have decided to make architectural choices that optimize query performance and reliability.
The most valuable queries are the ones that unlock new data for the first time. there’s generally bigger knowledge gains available by adding new data than asking more sophisticated questions of your existing data. The problem is that the new data you want lives in systems you don’t control for political, security, financial, or legacy reasons.
We usually have to accept that the related data is split up (often terribly so) and design systems to minimize this cost.