20 August 2011

Fun With Epistemic and Aleatory Uncertainties

Deep in the comments on an earlier thread Paul Baer offers the following hypothetical:
In my statistics class, I ask my students "what is the probability that when I flip this coin, it will land heads?" (And yes, assume it's a fair coin.)

Of course they answer 50% or some equivalent.

Then I flip it and hold it covered on the back of my hand. Then I ask, "What is the probability that this coin is heads?" There's usually some puzzlement. Someone says "50%". And I say, "but either it's heads or it isn't. How can there be a fifty percent chance it's heads?"

Then I ask "what odds would you give me if I bet that it's not heads?" Eventually those who know what betting odds mean understand the point. Even when something has happened (like, the deck has been shuffled and the card that will be dealt could be known under some epistemic conditions DIFFERENT FROM OURS) we have to ACT as if the odds are, well, what we think they are.
To which I responded:
Consider the following case:

You flip a coin in your class and ask for the probability of a head. A savvy student replies:

[S1]: The odds of a head are 50-50

You then reveal to the class that the coin is not fair, in fact there is a 75% chance of a tail. You ask the student, now what are the odds of a head? (All while the flipped coin sits on your hand)

The student now replies:

[S2] The odds of a head are 25%.

Q1. Now would it be fair to say that [S1] was incorrect?
Answers gladly accepted.


  1. Yes, knowledge claims aren't special just because they are about uncertain parameters. We can be in error, or even ignorant, of what the distributions are. In fact, we'll always be in error, but sometimes the error is small enough in light of our purposes to be negligible.

  2. S1 is a prior. S2 is Bayesian posterior once the additional information (about the coin being biased) is known. Both were correct at the time they were uttered.
    Statements of probability (or odds) do not reflect the truth about one particular event (this flipping of the coin) but about the hypothetical result of a collection of such events (if you flip the coin 1000 times, it would be heads about 500 times). Even knowing that the odds of heads are 25:75 or whatever (even 999:1) does not keep the coin from falling on the opposite side this time. More precisely, there is no individual event capable of disproving or falsifying the statement that this coin has 75% (or whatever) probability of getting heads. The statement would have to fend for itself, on the basis of repeated events or on the basis of an examination of the physical qualities of the coin, regardless of the outcome in one particular throw.
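
A minimal sketch of the frequency reading in comment 2, assuming a simulated coin rather than a real one. The function name `heads_frequency` and the seed are illustrative choices, not anything from the thread. It shows that a long run of flips settles near the stated probability, while a single flip can only ever come out as 0 or 1 and so cannot falsify that probability by itself.

```python
import random

def heads_frequency(p_heads, n_flips, seed=0):
    """Fraction of heads in n_flips simulated flips with P(heads) = p_heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips

# Long runs: the observed frequency approaches the stated probability.
print("fair coin, 1000 flips:  ", heads_frequency(0.5, 1000))
print("biased coin, 1000 flips:", heads_frequency(0.25, 1000))

# One flip: the "frequency" is 0.0 or 1.0 whatever the bias is.
print("biased coin, 1 flip:    ", heads_frequency(0.25, 1))
```

Changing the seed changes the individual outcomes but not the long-run behaviour, which is the sense in which the probability statement has to "fend for itself" over repeated events.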

  3. Yes, because any estimate of non-quantum probability is based on a lack of information. In the present instance, the student lacked information that the coin was biased.

    To a really sharp-eyed student, who could watch the coin's trajectory sufficiently closely and accurately predict its fall, probability would not be an issue.

  4. Did I say yes? I meant no. The student was correct to say the odds were 50:50. He lacked the extra information you have, just as in saying they are 75%, you lack information about the entirely deterministic fall of the coin. A computer, which could track the fall of the coin accurately and predict its fall, would say the odds are 100% or 0%.

    My point, though, that any non-quantum probability estimate is merely a result of lack of information, not inherent uncertainty, stands.

  5. No. S1 was correct. And S2 was correct. (Also, negative questions like that are more difficult to comprehend than a positive, e.g., "Was S1 correct?")

    All probabilities have conditions baked into them, based on what we know. Without any other knowledge 50% is the correct guess (because it's not going to be on its side in his hands!).

    With better information, the correct probability changes.

    Also, because this topic hasn't quite become enough of a probability flame war, may I direct everyone's attention to the Monty Hall Problem?

  6. Hopefully this does not run contrary to blog protocol, but on the earlier post on this topic I posted the following:
    IMHO, [S1] was incorrect before [S2] and therefore remains incorrect after [S2]. The [S1] statement needs to be qualified; otherwise it is an incomplete and, therefore, incorrect statement. If the student says [S3] "Since I do not know the nature of the coin or other factors influencing the probability of heads appearing, I do not know the probability of heads", then the student is correct. I do not think that putting "if" statements as conditionals will create a correct answer except for what amounts to a true explicit statement as to the probability of a head, i.e., you give the answer at the same time as asking the question. In these terms, Statement [S2] is also incomplete since it assumes that the instructor is telling the truth. The equivalent correct answer is [S4] "Since I do not know whether you are telling the truth or not, I do not know the probability of a head". If we do not know, we do not know.
    It all feels a bit solipsistic, but that may be the price of dealing with the forecasting of future events.

    To continue the football example, asking me the odds of Bolton Wanderers winning the Premier League is an impossible question to answer "correctly" except by saying something that is equivalent to [S3] above. However, we know that people act as if they have an answer to this question by placing bets but their actions do not change the accuracy of the "I do not know" answer.

  7. Matt Briggs here (asked to comment from one of my readers).

    Q1. No. Since all probability is conditional on information of some kind, before the new information is revealed, the only information available is that there is a two-sided object, only one side of which will show.

    Conditional on that information, the probability is 0.5 for heads.

    But then the new information comes, which dictates that the probability is 0.75 for tails (0.25 for heads). That is, conditional on this new information, the probability of tails is 0.75. But the probability is still 0.5 conditional on the first information.

    There is a third source of information: me. Once I flip, I can see the coin but you cannot. Conditional on the information I have, the probability is 0 or 1. But to you, it is still 0.75 for tails.
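
The three information states in comment 7 can be sketched as a single function whose answer depends only on what the caller knows. This is an illustration, not a general inference routine: the name `prob_heads` and its two parameters are hypothetical, and the bias used below is the post's stipulated 75% chance of tails, i.e. 0.25 for heads.

```python
def prob_heads(known_bias_heads=None, observed_face=None):
    """Probability of heads conditional on the information supplied.

    known_bias_heads: P(heads) if the bias has been disclosed, else None.
    observed_face: "heads" or "tails" if the coin has been seen, else None.
    """
    if observed_face is not None:
        # The person who can see the coin: the probability is 0 or 1.
        return 1.0 if observed_face == "heads" else 0.0
    if known_bias_heads is not None:
        # The class after the bias is revealed.
        return known_bias_heads
    # Only "a two-sided object, one side of which will show" is known.
    return 0.5

print(prob_heads())                                    # student before the reveal
print(prob_heads(known_bias_heads=0.25))               # student after the reveal
print(prob_heads(known_bias_heads=0.25,
                 observed_face="tails"))               # the person holding the coin
```

Each call describes the same covered coin; only the conditioning information differs.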

  8. The 50-50 answer is "correct" but only from the student's point of view and as a probability in the scenario. Armed with the information given it was the best possible answer, including the possibility that the coin was not fair (because there was an equal likelihood the coin was weighted to heads).

    Anyone making a rational bet on the outcome of a coin toss, in the absence of other information, has to assume 50-50. It would be crazy to make a bet on the assumption the coin was weighted towards tails before you had any further knowledge.

    So the student’s answer was a correct probability at the time it was made and with the information given. It is meaningless to talk about it being incorrect at any other time or for a person with more information.

    The student’s answer would only be incorrect if he made a prediction that later turned out to be untrue. In the scenario presented no prediction is made.

    But climate modellers are making predictions, directly or indirectly, and therefore they will be right or wrong. In their case the errors are not in the real-world outcome, which is inevitable. If the modelling is correct the outcome should follow from the inputs. If they get it wrong it will not be because the climate deviated from their predictions, but because they incorrectly calculated the inputs or the processes. Any error is theirs, not the climate’s.

    In the NZ high school Statistics curriculum they actually ask questions based on this for confidence intervals, and the students have to be able to properly distinguish between the probability that the real answer falls within your range (wrong) and the probability that your range includes the actual answer (right).

    Unless climate is actually chaotic, which the modellers have explicitly rejected.
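
The confidence-interval distinction raised in comment 8 can be checked by simulation. The sketch below assumes normally distributed data with a known standard deviation and uses the usual 1.96 multiplier for a 95% interval. The function name `coverage` and every parameter value are illustrative assumptions, not taken from the NZ curriculum. The true mean is held fixed while the interval changes from sample to sample, so the 95% describes how often the procedure produces an interval that includes the fixed value.

```python
import math
import random

def coverage(true_mean=10.0, sigma=2.0, n=25, trials=5000, seed=0):
    """Fraction of 95% intervals (known sigma) that contain true_mean."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample; the interval moves, the true mean does not.
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials

print("share of intervals containing the true mean:", coverage())
```

The printed share should land close to 0.95, which is a statement about the interval-building procedure over many repetitions rather than about any one interval.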

  9. Even if you made the right conclusion based on the information you had, the conclusion is still wrong. So "Yes [S1] is incorrect."

    This illustrates the difference between an error in outcome and an error in judgement or reasoning.

  10. -responses-

    Thanks all, very interesting responses. A few of my own:

    1. #3, I am postulating that the odds are 75-25, so there is no uncertainty there.

    2. There is a difference between a "correct probability" based on what is known at the time and a "correct statement" as judged over time.

    The statement S1 is objectively false. The fact that the student did not know this does not make the statement any less false.

    3. I am surprised not to hear the language of "objective" and "subjective" uncertainties.

    The statement was subjectively correct but objectively wrong. This is another way to express Matt Briggs' comment on the conditionality of the statement.

    Question: Does IPCC WG I produce objective or subjective probabilities? (Hint: the answer should be obvious)

  11. #11 Roger,

    "Objective vs subjective uncertainties": You did, in several posts. They were couched in terms of what the probabilities were based on. Had you peeked at the coin, you would have been able to give a heads/tails PDF of 0 and 1 (or vice versa).

    To use your terms, the IPCC WG I obviously produces subjective probabilities, and in more ways than you mean here.

  12. Thanks for using my example, Roger!

    I find it interesting that in your version of the example, the coin is already tossed and sitting on my hand. So, the answer "25%" is still a subjective estimate of the likelihood of the different possible outcomes of a single event which has no objective probability - the coin is now heads or not.

    If events have "objective" probability distributions which can be known, then subjective estimates of those probabilities can be "wrong", though of course that's a vague term, and especially vague for comparing functions rather than point estimates. However, given that for most of the things we care about there are no "true" (objective) PDFs, the important question is whether subjective estimates are reasonable, not whether they are right or wrong.

  13. -13-Paul Baer

    Hi Paul ... I get what you are saying, but I find the distinction between "reasonable" and "right/wrong" to be a difficult one to maintain outside of textbook situations.

    Consider the role of rating agencies and subprime mortgage-based securities. Most economists (and everyone else) would be perfectly comfortable expressing the view that the ratings of such securities were "wrong" even though ratings are expressed probabilistically.

    Think of it another way. What if under your hand you have a die, not a coin ... one side of the die has the word "heads" and the other 5 the word "tails".

    Would this make [S1] wrong?

  14. Paul:
    You wrote:
    However, given that for most of the things we care about there are no "true" (objective) PDFs, the important question is whether subjective estimates are reasonable, not whether they are right or wrong.
    Is not this the whole point at issue in Roger's original post? What does "reasonable" really signify? They are certainly "reasoned" but to what extent are they reliable and ultimately falsifiable? Should the IPCC be required to make predictions that are falsifiable within some meaningful time horizon?

  15. Roger,

    I think you're confusing the assignment of a probability with a prediction of the outcome. The probability estimate should be based on the factors known to affect the outcome.

    An accurate deduction of the PDF does not mean an accurate prediction of the event (the trivial case of P = 1 notwithstanding).

    In the case of the 5:1 die, assuming that S1 knew it, then yes, he was wrong. The original question, however, does not have the students predicting the outcome of the flip but assessing the probabilities of outcomes.

    Now, consider this new die. If you asked me to predict, I would predict a roll of tails based on the PDF. That prediction could still be wrong, even though I had predicted the correct probability of 5/6 for an outcome of tails.

    To apply this back to the original topic, the IPCC is, indeed, making predictions from this standpoint. They may be based on a silly conditional (i.e., carbon emissions of such and such, etc.), but they are still predictions. They might argue that this line of thinking isn't right because of all of the stipulations and various scenarios, but if they're so attached to weaseling out of what they write down, I'd ask why they wrote it in the first place.
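
Comment 15's separation of a correct probability from a correct prediction can be illustrated with the die from the earlier reply (one face marked "heads", five marked "tails"). This is a sketch under the assumption that each face is equally likely; `FACES` and `miss_rate` are hypothetical names chosen for the example. Predicting "tails" on every roll is the best available prediction, and it still misses on roughly one roll in six.

```python
import random

# One face reads "heads", the other five read "tails".
FACES = ["heads"] + ["tails"] * 5

def miss_rate(n_rolls, seed=0):
    """Share of rolls on which the prediction 'tails' turns out wrong."""
    rng = random.Random(seed)
    misses = sum(rng.choice(FACES) != "tails" for _ in range(n_rolls))
    return misses / n_rolls

print("correct probability of tails:", 5 / 6)
print("miss rate when always predicting tails:", miss_rate(60000))
```

The probability assessment of 5/6 is right throughout; the individual predictions are what succeed or fail.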

  16. But we all know that there is very little statistics involved in IPCC predictions. Neither are they even predictions per se, more like worst case scenarios with usually only very unreliable models as backup. In simple terms; purely speculative guesses with a very pessimistic bias.

    Owing to the massive real uncertainties, virtually any guess is as good as any other. There is no pdf possible here! But the optimistic picture is rarely seen because that would not require any policy nor further research. And that would be like turkeys voting for Christmas!

    In fact if stats were to be properly used, instead of selectively picked model results, then whole chunks of the report would perforce disappear or else conclude that nothing much seems to be happening so far that is out of the ordinary. The entire edifice rests on models, hence types like Annan suddenly feel like cocks of the walk despite never having made any correct predictions nor having produced any model that is even adequate, never mind useful.

    It is entirely in climate modelers' self-interest to try to tell us that wrong is right.

  17. Blogger ate a comment from Jonathan, which he sent in by email:

    "-11-Roger, my calculations are simpler than yours. My argument is as follows.

    There is some parameter p, bounded by the range 0-1; for the moment we have no idea what sort of object p is beyond these facts. The IPCC have some underlying PDF on this parameter, which they have summarised by saying P(p less than or equal to 0.05)=0.8. You ask for my best estimate of p.

    I want to answer this by evaluating the expectation value [p] by integrating over P. I don't know P, but I do know enough about it to determine my limits above.

    Your argument seems to be that I should do something different because p is (in fact) itself a probability. I don't see why that's relevant."
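
One way to see what limits the summary in comment 17 allows is to compute the extreme cases directly. The sketch below is not necessarily the calculation Jonathan had in mind, since his limits are not reproduced in this thread. It assumes only what the comment states: p lies in [0, 1] and P(p ≤ 0.05) = 0.8. The function name `expectation_bounds` is hypothetical. The lowest possible expectation puts the 0.8 mass at 0 and the remaining 0.2 just above 0.05; the highest puts the 0.8 mass at 0.05 and the remaining 0.2 at 1.

```python
def expectation_bounds(threshold, mass_below, lo=0.0, hi=1.0):
    """Bounds on E[p] given only P(p <= threshold) = mass_below and lo <= p <= hi."""
    mass_above = 1.0 - mass_below
    # Smallest mean: mass below sits at lo, mass above sits right at the threshold.
    lower = mass_below * lo + mass_above * threshold
    # Largest mean: mass below sits at the threshold, mass above sits at hi.
    upper = mass_below * threshold + mass_above * hi
    return lower, upper

lower, upper = expectation_bounds(threshold=0.05, mass_below=0.8)
print("E[p] lies between", lower, "and", upper)
```

For the stated summary this gives a range of roughly 0.01 to 0.24, and the arithmetic is the same whether or not p is itself a probability.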

  18. Probabilities that depend only on an absence of information are always 'subjective'. There is nothing 'objective' in the 75% probability statement. The fall of the coin is deterministic, and the probability of 75% is assigned because, while you know the bias in the coin, you have insufficient information about the impulse you will give to the coin and its trajectory following that impulse. If you knew those, you could predict with full confidence the result of the coin toss.

    You are making an entirely specious distinction.