stadistics (n.) the enjoyment of inflicting pain on others by inundating them with facts and figures.

Bayesian trickery

It is often claimed that regular, normal people naturally think like Bayesians. Leaving aside whether we should leave the foundations of our subject to the average punter, I suspect that this might be true. But it really depends on how you frame the question. Below is a description of a class discussion exercise used by Bill Jefferys, who is a Professor of Astronomy but also an adjunct professor of statistics, teaching a course in Bayesian statistics at the University of Texas.

See if you can spot the flaws.

I really support the use of both actual and thought experiments in class. In my undergraduate courses, I got told a great deal about what to do and how to do it, but almost nothing about why. Most mathematically savvy students are able to quickly see why something makes sense – this is probably why they are good at maths. Non-mathematical types often just cannot see the point. But epistemology is a complex and subtle field and should be given much more weight in standard statistics courses.

Anyway, here is Bill Jefferys’ nice class exercise, which I found in a comment on Andrew Gelman’s blog and which I will now summarise.

Step 1. Bring a dollar coin to class. Ask the class “If I flip it what is the probability that it will come up heads?” Everyone agrees it is 0.5 or very close to 0.5, though there might be some comment about biased flipping and sleight-of-hand.

Step 2. Flip the coin onto the floor and immediately step on it. I do not know if it is heads or tails. Neither do the students. Ask again: “What is the probability that it is heads?” Most people will say 0.5, but those who have thought carefully about the foundations of frequentist inference might say that it is either heads or tails, but this cannot be quantified as a probability. Those that say 0.5 are thinking as Bayesians; the others are thinking as frequentists.

Step 3. Look at the coin without letting anyone else see it. Then say “I now know whether it is heads or tails. What is the probability that it’s heads?” Most will still say that it’s 0.5. They are still thinking as Bayesians. Their background information is personal and is not the same as mine. It is rational for them to be uncertain while I, having different information, am certain.

Step 4. Announce that I have seen a head and ask them “What’s the probability that it’s heads?” This poses something of a conundrum, since many of the students will tumble to the fact that I might not be telling the truth; so many of them will offer a higher number, 0.8 or 0.9, but not 1.0! It may be argued that this is a Bayesian approach again as they are conditioning on “professor says it is heads” which is not the same as “It is heads.”

Step 5. Invite a student to look at the coin and announce what s/he saw. Usually the student will report the same thing you do!

Step 6. Let everyone take a look for themselves.

This is a great exercise. But Bill claims that

“This is an exercise in Bayesian thinking - it is legitimate to quantify your uncertainty about a state of nature by putting a probability distribution on it and conditioning on data.”

Stuff and nonsense. This is not even close to a justification of Bayesian statistics or a refutation of the frequentist paradigm - even if I trusted the thought processes of undergraduates as the basis for statistical inference. Here are the flaws.

Flaw 1. Statisticians are, or should be, aware of how careful wording of a survey question can produce almost any desired result. Bill is not a statistician by trade so this may not be front of mind for him. But imagine that throughout this exercise we ask an alternative question:

“Is this particular outcome a head or a tail? Yes or no?”

We would get many more people saying “I cannot possibly say.” Bill’s question actually requires the students to give a probability! No wonder that so many students give him one. By forcing the students into a Bayesian straight-jacket, he is assuming what he is trying to prove.

Flaw 2. How about when he tells them that he saw a head? Would a frequentist update the inference to “condition on this data?” Well, we would not condition on the data. We would posit a model for the data. I might say to myself:

“There is a 90% chance that the professor will tell the truth. So the data “professor says it is a head” has probability 0.9 if the truth is heads and 0.1 if the truth is tails. The likelihood ratio is 9 to 1 in favour of heads.”

This is a pretty strong inference and it is not necessary to go the extra step, put a prior on heads and tails, and convert this into a posterior probability for heads, which under an even prior would be 9/10.
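The arithmetic behind the likelihood ratio and the optional Bayesian step can be sketched in a few lines. This is only an illustration: the function name `posterior_heads` is made up for the example, the 90% truth-telling rate is the figure used above, and the even prior on heads is an assumption that the exercise itself never states.

```python
from fractions import Fraction


def posterior_heads(prior_heads, p_truth):
    """P(heads | professor says heads) by Bayes' rule (hypothetical helper)."""
    like_heads = p_truth        # P(says heads | coin is heads)
    like_tails = 1 - p_truth    # P(says heads | coin is tails)
    numerator = like_heads * prior_heads
    return numerator / (numerator + like_tails * (1 - prior_heads))


p_truth = Fraction(9, 10)                       # professor tells the truth 90% of the time
likelihood_ratio = p_truth / (1 - p_truth)      # needs no prior at all
even_prior = posterior_heads(Fraction(1, 2), p_truth)  # assumed prior of 1/2

print("likelihood ratio:", likelihood_ratio)    # 9
print("posterior under an even prior:", even_prior)  # 9/10
```

The likelihood ratio depends only on the model for the professor's report, whereas the posterior moves with whatever prior is supplied.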

Flaw 3. A more subtle bias in the exercise is that the students see the spinning coin. So on this occasion nature has performed a random experiment to determine the parameter. In this case, it is clearly reasonable - though not necessary - to take the parameter to be a random variable. Would the students think it is so obvious that the gravitational constant is a random variable? I think not.

As I have said in other posts I am not anti-Bayesian. However, I am anti-Bayesian trickery. I am sick to death of hearing it claimed that (1) people not brainwashed by frequentist theory are automatically Bayesian, (2) frequentists cannot make use of prior information. I will have a(nother) post on the second issue next week.

5 Responses to “Bayesian trickery”

  1. John Henstridge Says:

    I have often thought that anyone who does not sometimes think like a Bayesian cannot be in touch with the real uncertainties in data, while anyone who believes that the Bayesian approach is “right” cannot be in touch with the realities of making decisions based on data.

    It is surprising that many statisticians who use models all the time do not realise that the Bayes and Neyman-Pearson approaches are simply models for how you might use data and draw inferences. As George Box said, all models are wrong, but some models are useful. It is a bit much to expect one wrong model to be always useful.

  2. Patrick Cordue Says:

    I agree that this exercise is not helpful in the Bayesian-frequentist debate (which is not a particularly helpful debate either). It is an interesting experiment which does raise some interesting points for discussion.

    A point not noted in the post is that the question changes from step 1 to step 2 and later steps. In step 1 the coin has not been tossed and the question is about the unknown state of nature - the probability of getting heads with this coin assuming it is “fairly” tossed. After the coin has been tossed, the questions relate to the probability that the coin is in a particular state - either heads up or tails up (I am assuming we don’t have a double-tailed or double-headed coin!).

    I know that the issue of pre-experiment probabilities and post-experiment certainties - which may be unknown - is often raised in Bayesian-frequentist debates. Certainly, many people find it perplexing that a 95% confidence interval doesn’t have a 95% probability of containing the true value AFTER the experiment has been done.

    I am happy to use Bayesian or frequentist methods - whatever suits the particular problem AND the participants. Certainly, for estimating the probability of heads for a given coin (which is repeatedly tossed) I much prefer a Bayesian estimator to the badly performing maximum likelihood estimator (which although minimum variance unbiased is actually very bad in terms of mean squared error).
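The mean-squared-error comparison in this comment can be checked exactly rather than by simulation. The sketch below makes two assumptions the comment does not spell out: the Bayesian estimator is taken to be the posterior mean under a uniform Beta(1, 1) prior, and the sample size of 10 tosses is an arbitrary illustration. The function names are hypothetical.

```python
def mse_mle(p, n):
    """Exact MSE of the MLE k/n when k ~ Binomial(n, p); it is unbiased, so MSE = variance."""
    return p * (1 - p) / n


def mse_posterior_mean(p, n, a=1.0, b=1.0):
    """Exact MSE of the posterior mean (k + a) / (n + a + b) under an assumed Beta(a, b) prior."""
    variance = n * p * (1 - p) / (n + a + b) ** 2
    bias = (a - (a + b) * p) / (n + a + b)
    return variance + bias ** 2


n = 10  # hypothetical number of tosses
for p in (0.05, 0.25, 0.5):
    print(f"p={p}: MLE {mse_mle(p, n):.5f}, posterior mean {mse_posterior_mean(p, n):.5f}")
```

Under these assumptions the posterior mean has the smaller mean squared error when p is near 0.5, which is the relevant region for a real coin, while the MLE does better when p is close to 0 or 1.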

  3. Possibly best for some of us to begin reading the posting with the comment near the very end:
    “As I have said in other posts I am not anti-Bayesian. However, I am anti-Bayesian trickery.”

    Throughout all these postings, no-one is suggesting that the existence of invalid arguments supporting something necessarily invalidates that something.

    I guess I’ve just got two or three or four minor comments in response. My comments are variously about people who have been placed in a ‘straight-jacket’ of giving over 100 probabilities a year for the past decade, about anti-Bayesian trickery, about the foundations of inference under model misspecification and about the distinction (and not all seem to think there is such a distinction) between inductive inference and prediction.

    None of these comments would seem to contradict anything already posted on this topic in this discussion, but I hope that at least some of these comments will be relevant and of sufficient interest.

    Whether or not what we invite people into is a partially or totally Bayesian strait-jacket, for 26 weeks of the year, we invite people to give probabilities on the outcomes of Australian Football League (AFL) footy games, as per http://www.csse.monash.edu.au/~footy and http://www.csse.monash.edu.au/~footy/about.shtml#info. As might be evident at http://www.csse.monash.edu.au/~footy/ladder/ladder.info.23.shtml, trained and able professional statisticians enter this competition and have a history of doing well. And, whatever the strait-jacket might or might not be, we also have a competition where one gives a mean and s.d. on the margin (about which more is said at http://www.csse.monash.edu.au/~footy/about.shtml#gauss).

    Let us not forget that there are also issues of classical trickery (or anti-Bayesian trickery) - and, if not trickery per se, then muddleheadedness. I am reminded of an argument I once read (published in a well-respected outlet in the past decade and due to someone still living who I’d rather not name, but possibly used and I think still used by others) which seems to be roughly as follows:
    “Maximum A Posteriori (MAP) is not statistically invariant under re-parameterisation. Indeed, we are not aware of any Bayesian methods of point estimation which are statistically invariant. [GRAND LEAP ABOUT TO FOLLOW.] Therefore there are no statistically invariant Bayesian methods of point estimation.”

    Suffice it to say that at least three statistically invariant Bayesian methods of point estimation get a mention in http://dx.doi.org/10.1093/bjps/axm033 (and the relative merits of various approaches are discussed throughout said article).

    Penultimately, re John Henstridge’s comment with George Box’s comment about all models being wrong but some models being useful, this raises the issue of model misspecification. Many methods of inference are not nearly as robust under model misspecification as when the model is correctly specified. There are results in this area, but the field strikes me as fairly new. For some negative results about many methods with hints at the end of some hope via different approaches, see Peter Grünwald & John Langford (2007): Suboptimal behavior of Bayes and MDL in classification under misspecification. Machine Learning 66 (2-3): 119-149.

    And, last, I’d like to raise the issue of inductive inference to the single best model vs prediction. The impression I get, as discussed in http://dx.doi.org/10.1093/bjps/axm033 and elsewhere, is that Bayesians are very happy and willing to do (weighted) model averaging (using the posterior) for prediction where many (classical) non-Bayesians currently seem loath to do any model averaging for prediction but rather seem to wish to stay with one model.

    In sum, trickery is indeed a bad thing. But, in discussion of important fundamental mistakes, people will make mistakes - let’s just hope they’re genuine.

  4. Berwin A Turlach Says:

    “It is often claimed that regular, normal people naturally think like Bayesians.” Agreed, I have often heard this claim too, and I do not remember ever being offered supporting evidence.

    But if this is really the case, why are so many people confusing the sensitivity/specificity of a test with the positive predictive value of the test? And there are plenty of other examples/experiments that show that regular, normal people do not naturally think like Bayesians. The human species did not evolve to solve probability/inferential problems according to a normative approach without being trained in it.

    Another claim that is often made is “everybody can do mathematics, when we catch a ball, the brain subconsciously solves the DE describing the path of the ball enabling us to catch it”. While I know people who could not catch a ball if their life depended on it, I used to subscribe to this claim. Until I read Gigerenzer’s book “Gut Feelings: The Intelligence of the Unconscious”; he presents some convincing evidence that this claim might not withstand critical examination. This book might also prove enlightening to those who think that people are natural Bayesians.
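The sensitivity versus positive predictive value confusion raised in this comment is easy to make concrete. The sketch below applies Bayes' rule to a diagnostic test; the sensitivity, specificity and prevalence figures are invented purely for illustration, and the function name is hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(condition present | positive test), from Bayes' rule."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Illustrative numbers only: a very sensitive test for a rare condition.
ppv = positive_predictive_value(sensitivity=0.99, specificity=0.95, prevalence=0.01)
print(f"sensitivity = 0.99, positive predictive value = {ppv:.3f}")
```

With these made-up figures the test detects 99% of true cases, yet only about one positive result in six is a true case, because false positives from the much larger healthy group dominate.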

  5. Bill Jefferys Says:

    Chris, you have way overinterpreted what I wrote.

    First, my experiment is not intended in any way to justify Bayesian statistics nor to refute frequentist statistics. It is merely a way to “break the ice,” so to speak, to get students comfortable with the idea of using probability to encode uncertainty about states of nature (what Tony O’Hagan, in an article referenced on Andrew’s blog, called “epistemic uncertainty”) as opposed to the uncertainty about stochastic processes (what he calls “aleatory uncertainty.”) The reason I do this is that this is what the students are going to be doing in my course for the rest of the semester.

    As for your characterizing my comment:

    “This is an exercise in Bayesian thinking - it is legitimate to quantify your uncertainty about a state of nature by putting a probability distribution on it and conditioning on data.”

    as “stuff and nonsense,” do you deny that Bayesians regard it as legitimate to quantify uncertainty about a state of nature by putting a probability distribution on it and conditioning on new data as it comes in? If you do deny this, what do you think Bayesians do?

    I agree with you that the students might react differently if I asked them your question, “Is this particular outcome a head or a tail? Yes or no?”, but since that’s not my purpose in doing this exercise (one that Andrew says he also uses), that would be inappropriate. I am, frankly, not in the business in this course of refuting frequentist ideas (which I in fact use frequently in my research), but only in introducing these students to Bayesian ideas.

    Finally, my comments about people being natural Bayesians should not be over-interpreted either. I merely mean to say that most people seem to be comfortable with the idea of expressing uncertainty about unique events, at least some of them, by using probability. And my experience with physical scientists, who regularly mistake confidence intervals for credible intervals (for example) and want to make obviously Bayesian statements such as “the probability that the Hubble constant is between 65 and 75 km/sec/Mpc is 0.95” (say) rather than a confidence interval kind of statement, reinforces this view.
