Saw this pop up on Twitter. The website is awful – a true PhD-tries-to-webdev site.
Note that the article is hidden until you take the test. Just spam next-next-next until it shows you the article.
The observation here is that if you are the only resident in your zip code, the combination (zipcode, birthdate, gender) identifies you 100% of the time. In the real world, the identification rate from those three attributes is usually >50%.
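To make the uniqueness point concrete, here is a minimal sketch that counts how many records in a table are the only ones with their combination of attributes. This is not the paper's generative model, just a brute-force count over a complete dataset. The `unique_fraction` helper, the field names, and the toy records are all made up for illustration.

```python
from collections import Counter

def unique_fraction(records, keys):
    """Return the fraction of records whose combination of `keys` appears exactly once."""
    if not records:
        return 0.0
    combos = [tuple(r[k] for k in keys) for r in records]
    counts = Counter(combos)
    unique = sum(1 for c in combos if counts[c] == 1)
    return unique / len(records)

# Hypothetical toy data, not taken from the article or the paper.
people = [
    {"zipcode": "02139", "birthdate": "1985-03-14", "gender": "F"},
    {"zipcode": "02139", "birthdate": "1985-03-14", "gender": "F"},
    {"zipcode": "02139", "birthdate": "1990-07-01", "gender": "M"},
    {"zipcode": "99950", "birthdate": "1972-11-30", "gender": "M"},
]

# Zip code alone singles out only the lone resident of 99950.
print(unique_fraction(people, ["zipcode"]))  # 0.25

# Adding birthdate and gender makes half of the records unique.
print(unique_fraction(people, ["zipcode", "birthdate", "gender"]))  # 0.5
```

The lone resident of the second zip code is unique on zip code alone, which is the 100% case described above. Each extra attribute can only split groups further, so the unique fraction never goes down as you add columns.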
tl;dr: Individuals can be de-anonymized from sparse samples of large datasets, given relatively few demographic attributes. Not exactly news, but these folks have built a generative model to estimate just how bad it may be.
Here’s their Nature Communications paper for the nitty-gritty.