1. 35

  2. 12

    It can be useful and intuitive to illustrate the paradox in a chart. For example: https://twitter.com/remiemonet/status/984893903605321729?s=20

    If we look at the population as a whole, the best fit line points in a certain direction. If we break down the same data into key groups, it’s clear that the best fit line for each group would point in a completely different direction. Our interpretation of the data can be different or even reversed, depending on whether we include group identification as a feature.
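
    A minimal sketch of that reversal, using made-up points rather than the data from the linked chart. The helper `slope` and the two groups are hypothetical names chosen only for illustration; the helper computes an ordinary least-squares slope by hand so nothing beyond the standard library is assumed.

    ```python
    def slope(points):
        """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
        n = len(points)
        mean_x = sum(x for x, _ in points) / n
        mean_y = sum(y for _, y in points) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
        var = sum((x - mean_x) ** 2 for x, _ in points)
        return cov / var

    # Invented data: within each group, y falls as x rises,
    # but group_b sits higher on both axes than group_a.
    group_a = [(1, 6), (2, 5), (3, 4)]
    group_b = [(6, 11), (7, 10), (8, 9)]

    slope_a = slope(group_a)                # negative within group A
    slope_b = slope(group_b)                # negative within group B
    slope_all = slope(group_a + group_b)    # positive once the groups are pooled

    print(slope_a, slope_b, slope_all)
    ```

    Each group's line slopes downward, yet the line fitted to the pooled points slopes upward, because the difference between the groups dominates the trend inside them.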

    1. 6

      That sounds like base rate fallacy, so now I’m doubting whether I understand either of them :)

      1. 6

        This one is also useful to visualize: https://byrdnick.com/wp-content/uploads/2020/07/Base-rate-box-problem-drug-test-diagnostic-reasoning-nick-byrd-512x446.jpeg

        In the base rate fallacy, the correct interpretation is a straightforward application of Bayes’ rule, which turns out to be unintuitive for the layperson. Essentially, even if you have a very accurate test or detector, when the “base rate” (the true prevalence of the condition in the population) is very low, most positive results can still be false positives, so a positive result on its own tells you much less than the test’s accuracy suggests.
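
        As a rough worked example of that Bayes’ rule calculation: the function name `positive_predictive_value` and the numbers (0.1% prevalence, 99% sensitivity, 99% specificity) are assumptions picked for illustration, not figures from the linked image.

        ```python
        def positive_predictive_value(prevalence, sensitivity, specificity):
            """P(condition | positive test), computed with Bayes' rule."""
            true_positive = sensitivity * prevalence
            false_positive = (1 - specificity) * (1 - prevalence)
            return true_positive / (true_positive + false_positive)

        # Hypothetical numbers: a rare condition and a test that is right 99% of the time.
        ppv = positive_predictive_value(
            prevalence=0.001,   # 1 in 1,000 people has the condition
            sensitivity=0.99,   # P(positive | condition)
            specificity=0.99,   # P(negative | no condition)
        )

        print(f"P(condition | positive) = {ppv:.3f}")
        ```

        With these assumed inputs the probability comes out at roughly 9%, so about ten out of every eleven positives are false alarms even though the test is "99% accurate".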

        1. 3

          The Wikipedia article on Simpson’s paradox does a good job of illustrating it with examples. If anything, it’s a caution (beyond those which already exist) against reading too much into base rates.