1. 4

  2. 4

    Critically, it means the Bayesian–subjectivist can make decisions as if the same hypothesis – based on the same evidence – is both true and false, depending on the reward structure of the decision ahead of them. In competition for limited resources, the Bayesian–subjectivist is going to crush the frequentist–objectivist.

    At this point I’ve read dozens of Bayesian articles trash-talking frequentists. I have not once met a self-identified frequentist. I don’t believe they’re real. If they are, I have no idea what they look like. It’s like I’m reading articles about how bigfoot is an antivaxxer.

    under the frequentism–objectivism school, we cannot make decisions within the framework of probability. We have to step out of this framework to make any decisions.

    But didn’t you have to step out of the framework with “bayesian-subjectivism” school too? To think “there’s a 77 % chance variant A is actually better than B”, you had to have picked a prior beforehand.
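To make the prior's role concrete, here is a minimal sketch under stated assumptions: the conversion counts, the Beta priors, and the function name `prob_a_beats_b` are all hypothetical, chosen only for illustration. It estimates P(A is better than B) by sampling from independent Beta posteriors, once with a flat prior and once with a strong prior centered near a 10% conversion rate.

```python
import random

def prob_a_beats_b(conv_a, n_a, conv_b, n_b, prior_alpha, prior_beta,
                   draws=100_000, seed=0):
    """Monte Carlo estimate of P(rate_A > rate_B) under Beta-Binomial models.

    Both variants share the same Beta(prior_alpha, prior_beta) prior.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for each arm: Beta(prior + successes, prior + failures).
        p_a = rng.betavariate(prior_alpha + conv_a, prior_beta + n_a - conv_a)
        p_b = rng.betavariate(prior_alpha + conv_b, prior_beta + n_b - conv_b)
        wins += p_a > p_b
    return wins / draws

# Hypothetical data: A converts 12 of 100 visitors, B converts 9 of 100.
flat = prob_a_beats_b(12, 100, 9, 100, prior_alpha=1, prior_beta=1)
skeptical = prob_a_beats_b(12, 100, 9, 100, prior_alpha=20, prior_beta=180)

print(f"flat prior Beta(1, 1):        P(A > B) ~ {flat:.2f}")
print(f"strong prior Beta(20, 180):   P(A > B) ~ {skeptical:.2f}")
```

The same data yield a noticeably different "chance A is better" under the two priors, so the headline number is only as framework-internal as the choice of prior.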

    1. 2

      In the spectrum of things, I’d say I identify as a frequentist. I find all these articles pretty tiresome, though. The connections are much stronger than articles like this acknowledge…

      Generally, I find Bayesian stuff often does a poor job characterizing or thinking about important things like sampling error or data-driven analysis decisions (though of course it is possible to handle those things).

      Also, I’d challenge one to come up with a useful interpretation of specific probability values without eventually appealing to some frequentist scenario like betting…

      1. 2

        When you say you’re a frequentist, do you mean you won’t use bayesian methods? Most of the people I know are either capital-B Bayesians who think that they should never use frequentist methods, or “eh I use both whatever works” people.

        1. 3

          Well, yeah… whatever works. I’d say I’m frequentist in that I believe frequentist methods are valid and useful, which seems to be the point of contention of the never-ending pro-Bayesian posts?

        2. 2

          It’s unlikely that you’re actually a frequentist. Try this quiz to find out https://www.youtube.com/watch?v=GEFxFVESQXc

          1. 1

            What’s the probability given that I’ve never seen a Black Swan?

        3. 1

          I think very few people identify as frequentists for two reasons:

          (1) frequentist hypothesis testing is the norm in statistics. It’s less common for people who belong to the majority group to actively identify with it because most people will just assume that they belong to it. For example, if you read a paper that reports results of a t-test (or whatever else), you can safely assume that they did it within a frequentist framework (i.e., null hypothesis, alternative hypothesis, reject/not reject the null hypothesis).

          (2) most people think about probability in a Bayesian way regardless of whether their statistical framework is frequentist or Bayesian. That is, people think about probability as the chance of something being true (e.g., a p-value of 0.01 means that they are 99% sure that the null hypothesis is false). This is how probabilities should be interpreted within a Bayesian framework. It is also the wrong interpretation of probabilities in a frequentist framework. There, the interpretation relies on repeated sampling from a population: the p-value is the frequency, under the null hypothesis, of samples whose statistic is equal to or more extreme than the one observed.
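The frequentist reading in (2) can be sketched with a toy example. Everything here is an assumption made for illustration: a fair-coin null hypothesis, 100 flips, 62 observed heads, and the helper names. The code computes the two-sided p-value exactly and then recovers roughly the same number as a frequency over many simulated experiments run under the null.

```python
import random
from math import comb

N_FLIPS = 100          # hypothetical experiment size
OBSERVED_HEADS = 62    # hypothetical observed result

def exact_two_sided_p(n, observed):
    """Exact P(|heads - n/2| >= |observed - n/2|) for a fair coin."""
    distance = abs(observed - n / 2)
    extreme = sum(comb(n, k) for k in range(n + 1)
                  if abs(k - n / 2) >= distance)
    return extreme / 2**n

def simulated_two_sided_p(n, observed, experiments=10_000, seed=0):
    """Fraction of simulated null experiments at least as extreme as observed."""
    rng = random.Random(seed)
    distance = abs(observed - n / 2)
    hits = 0
    for _ in range(experiments):
        heads = sum(rng.random() < 0.5 for _ in range(n))
        hits += abs(heads - n / 2) >= distance
    return hits / experiments

print("exact p-value:    ", round(exact_two_sided_p(N_FLIPS, OBSERVED_HEADS), 4))
print("simulated p-value:", round(simulated_two_sided_p(N_FLIPS, OBSERVED_HEADS), 4))
```

Note what the number describes: how often a fair coin would produce data this extreme, not how likely it is that the coin is fair.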

          The trade-off between Bayesian and frequentist statistics is between frontend and backend complexity. Bayesian gives you a clean interpretation of results at the cost of complexity in modeling and program specification. Frequentist statistics gives you a wonky interpretation of results but has more straightforward modeling/testing.

        4. 2

          This is a brilliant practical example of why bayesian approaches are more powerful. Thanks!

          1. 1

            For the frequentist hypothesis testing, we don’t treat the two as the same, we treat them as indistinguishable. There would be no evidence that supporting both was necessary. Or you learn that your data is insufficient for the purpose. For the Bayesian example, if the data is sufficient for a decision when hypothesis testing gives you no significant difference, then your prior is pretty strong in one direction. A different prior would produce other results.

            There are deeper philosophical/mathematical issues. It’s been years since I looked at this, so I may get some of the details wrong. The justification for probability as belief (that is, your expectations about outcomes are captured by one number in a framework) is based on a theorem from the mid-20th century that rational actors with a prior are Bayesian. Unfortunately (see the late chapters in Berger’s ‘Statistical Decision Theory and Bayesian Analysis’) the obvious extension of allowing classes of priors (say, all priors where half the probability mass is <0.5, or only specific points are pinned based on partial information) makes those theorems fall apart.

            Probability as a formal system has multiple interpretations. Combinatorics uses it as a means for proving the existence of certain objects, which has an entirely different philosophical underpinning. I think the proper takeaway is that probability is a tool for formulation, but not the entire basis of inference. The usual place statisticians start from today is Wald’s decision theory. In decision theory, you are interested in admissible decision procedures, i.e., those that are not everywhere worse than some other procedure. There are usually a lot of different procedures that are better in some areas than others, so you need a criterion for choosing among them. Choosing the criterion depends on what you’re doing. Unbiasedness, minimax, Bayesian, and others are all criteria. Bayesian procedures (that is, the procedures corresponding to Bayes priors) occupy an interesting place in that the set of Bayesian procedures is larger than the set of admissible procedures, but not much larger, so you can often prove things over all priors and get properties over all admissible procedures.
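A small sketch of the "better in some areas than others" point, under assumptions that are mine rather than the comment's: the problem is estimating a binomial proportion p from n = 10 trials under squared-error loss, and the two procedures compared are the sample proportion and the posterior mean under a uniform prior. The function names are hypothetical.

```python
from math import comb

def risk(estimator, p, n):
    """Exact mean squared error of estimator(x, n) when x ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) * (estimator(x, n) - p)**2
               for x in range(n + 1))

def sample_proportion(x, n):
    """Unbiased estimator (also the maximum-likelihood estimate)."""
    return x / n

def posterior_mean_uniform(x, n):
    """Bayes estimator under a Beta(1, 1) prior and squared-error loss."""
    return (x + 1) / (n + 2)

n = 10
for p in (0.01, 0.1, 0.3, 0.5):
    r_freq = risk(sample_proportion, p, n)
    r_bayes = risk(posterior_mean_uniform, p, n)
    better = "sample proportion" if r_freq < r_bayes else "posterior mean"
    print(f"p={p:<4}  sample proportion={r_freq:.5f}  "
          f"posterior mean={r_bayes:.5f}  lower risk: {better}")
```

The risk curves cross: the posterior mean does better for p near 0.5 and worse for p near 0, so neither procedure is everywhere worse than the other, and picking between them requires an extra criterion such as unbiasedness, minimax, or a prior.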