## 21 August 2011

### More Fun With Uncertainty Guidance

Here is another math exercise.  In its AR4 report the IPCC says:
"The uncertainty guidance provided for the Fourth Assessment Report draws, for the first time, a careful distinction between levels of confidence in scientific understanding and the likelihoods of specific results. This allows authors to express high confidence that an event is extremely unlikely (e.g., rolling a dice twice and getting a six both times), as well as high confidence that an event is about as likely as not (e.g., a tossed coin coming up heads). Confidence and likelihood as used here are distinct concepts but are often linked in practice."
Here are some specific definitions to help you answer some questions.

A. "high confidence" means "about 8 out of 10 chance of being correct".
B. "extremely unlikely" means "less than 5% probability" of the event or outcome
C. "as likely as not" means "33 to 66% probability" of the event or outcome

So here are your questions:

1. If the IPCC says of a die that it has -- "high confidence that an event is extremely unlikely (e.g., rolling a dice twice and getting a six both times)" -- how should a decision maker interpret this statement in terms of the probability of two sixes being rolled on the next two rolls of the die?

2. If the IPCC says of a coin that it has -- "high confidence that an event is about as likely as not (e.g., a tossed coin coming up heads)" -- how should a decision maker interpret this statement in terms of the probability of a head appearing on the next coin flip?

Please provide quantitative answers to 1 and 2, and show your work.

Paul Baer said...

There is insufficient information to do the calculation rigorously. Actual calculation would require assignment of the rest of the second-order probability (the "confidence") to different probability distributions.

In the absence of that information, a reasonable decision-maker might
(a) decide to ignore the second-order probability and use the existing PDFs ("extremely unlikely" for double sixes and "as likely as not" for the coin toss)
(b) hedge against incorrect estimates of the "true" probabilities by assigning additional likelihood to "tail" events. In the coin toss event this doesn't really matter - if you only have two possible outcomes that are approximately equally likely, being somewhat wrong about the PDF won't make much difference for planning. But if you have reason to worry that an event someone called "extremely unlikely" might actually be just "unlikely", you might want to invest extra in hedging against it. (As well, of course, as extra research).

This just returns to my more general claim, that the issue for decision-makers is deciding what PDF to act "as if" is the "true" PDF, even when there really is no "true" PDF.

Hector M. said...

"how should a decision maker interpret this statement in terms of the probability of a head appearing on the next coin flip?"
In fact, nothing can be said about the NEXT coin flip or the NEXT die throw. Individual events may have any of the possible outcomes, and no particular outcome can possibly prove or disprove the accuracy of the probability assigned to an event.
What decision makers should deduce from a statement of probability (such as "the chances of two successive sixes is 1/36") is that IN THE LONG RUN, over a large number of throws, a string of two sixes would be obtained (approximately) in one of every 36 throws. This is of course good guidance also for the NEXT throw, but what actually happens this time is totally uncertain (I daresay fundamentally indeterminate); probabilities refer to a large number of events, not to a single one.
On the other hand, the statement about the probability of 1/36 may be made with total confidence (100%) because we know the mathematical chances of a fair die, and we also know that this particular die is fair. This second condition may not be perfectly true (small physical imperfections in the die, small changes in the way the die is thrown by different people, etc.), but this uncertainty is probably very small.
In other sorts of events, such as the probability of having a hurricane in September at a certain location, we can say, first, that our assessment of such probability is, say, 60%, but we can also state that we have misgivings about our calculations, and put only little confidence in its accuracy: the "true" probability perhaps is not 60% but some other figure. If somebody acts upon such probability many times, i.e. over many Septembers, perhaps on average hurricanes occur less or more than 60% of the times. People should be informed of the degree of confidence I put on my probability assessment.
Now, of course, an assessment of the chances of an uncertain outcome that is made with little confidence on its accuracy is uncertainty twice over, and probably of little use at all unless I could put some confidence interval around my probability estimate (such as "the probability of a hurricane in September is assessed to be between 0.20 and 0.90 (or between 0.55 and 0.65) with a median estimate of 0.60"), but these are hardly the most frequent situations.
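The long-run reading of the 1/36 figure described in this comment can be illustrated with a small simulation. This is only a sketch: it assumes a fair six-sided die, treats one "throw" as a pair of rolls, and the function name `double_six_frequency` and the seed are illustrative choices, not anything from the thread.

```python
import random


def double_six_frequency(n_trials: int, seed: int = 0) -> float:
    """Fraction of trials in which two rolls of a fair die both show a six."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_trials):
        first = rng.randint(1, 6)
        second = rng.randint(1, 6)
        if first == 6 and second == 6:
            hits += 1
    return hits / n_trials


# The observed frequency wanders for small samples and settles near 1/36
# only as the number of trials grows.
for n in (36, 3_600, 360_000):
    print(n, round(double_six_frequency(n), 5))
print("exact:", round(1 / 36, 5))
```

A single trial tells you nothing about whether 1/36 was the right number, which is the comment's point; only the long sequence does.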

Jonathan said...

There is not enough information provided to make sensible estimates, but one can give some very conservative limits.

1) The probability of the event occurring is in the range 0.8*0.00+0.2*0.05 to 0.8*0.05+0.2*1.00, that is 1% to 24%. A reasonable best guess is 12.5%.

2) The probability of the event occurring is in the range 0.8*0.33+0.2*0.00 to 0.8*0.66+0.2*1.00, that is 26.4% to 72.8%. A reasonable best guess is 49.6%. (Note that this becomes 50% if you replace 0.33 by 1/3 and so on.)
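The arithmetic in this comment can be checked with a few lines of Python. The sketch assumes the reading used in the comment: a fraction `confidence` of the weight sits somewhere inside the stated interval, and the remainder sits somewhere outside it but still within 0 to 1. The function name `expectation_bounds` is hypothetical.

```python
def expectation_bounds(confidence, low, high):
    """Bounds on the expected probability when `confidence` of the weight
    lies in [low, high] and the rest lies outside it, within [0, 1]."""
    rest = 1.0 - confidence
    # Smallest and largest values available to the leftover weight.
    outside_min = 0.0 if low > 0.0 else high
    outside_max = 1.0 if high < 1.0 else low
    lower = confidence * low + rest * outside_min
    upper = confidence * high + rest * outside_max
    return lower, upper


for label, low, high in [
    ("extremely unlikely", 0.0, 0.05),
    ("as likely as not", 0.33, 0.66),
    ("as likely as not (thirds)", 1 / 3, 2 / 3),
]:
    lower, upper = expectation_bounds(0.8, low, high)
    midpoint = (lower + upper) / 2
    print(f"{label}: {lower:.3f} to {upper:.3f}, midpoint {midpoint:.3f}")
```

The midpoints are the "best guess" figures quoted above: 0.125, 0.496, and 0.500 when thirds are used.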

Roger Pielke, Jr. said...

-1-Paul Baer

Thanks ... I agree with this answer, and have these thoughts about the implications:

(a) ignore the second order stuff (the IPCC generally does) -- bad idea, as this means ignoring potentially relevant info (see #3 below)

You write: "the issue for decision-makers is deciding what PDF to act "as if" is the "true" PDF, even when there really is no "true" PDF"

As I said on the other thread, I get what you are saying, but I have a hard time translating this to practical situations. I think that the decision calculus has to factor in what it means to be "wrong" in a probability judgment as related to the outcomes of a decision based on such judgments. Expressing the view that such judgments cannot be "wrong" is not the way to go.

I return to the notion that context matters in such judgments and flipping a coin has little in common with ratings of mortgage-backed securities.

Roger Pielke, Jr. said...

-3-Jonathan

This seems to be on the right track, but I would express the math differently.

Here is how I'd express the potential range for the answer to #1, given the assumptions:

Maximum probability of two sixes =

= (0.8*0.05 + 0.2*1.00)^2
= 5.76%

Minimum probability of two sixes =

= (0.8*0.00 + 0.2*0.05)^2
= 0.01%

Please check my math, but I think this is correct. If so, then this is a huge difference in probabilities.
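For anyone who wants to check the math as requested, here is a minimal sketch that reproduces the two figures. It only verifies the arithmetic; whether squaring is the right operation is disputed in the comments that follow. The variable names are illustrative.

```python
# Weighted sums taken from the comment above.
max_single = 0.8 * 0.05 + 0.2 * 1.00
min_single = 0.8 * 0.00 + 0.2 * 0.05

# Squaring, as done in the comment, on the grounds that there are two rolls.
max_two_rolls = max_single ** 2
min_two_rolls = min_single ** 2

print(f"maximum: {max_two_rolls:.2%}")  # 5.76%
print(f"minimum: {min_two_rolls:.2%}")  # 0.01%
```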

dljvjbsl said...

============
1. If the IPCC says of a die that it has -- "high confidence that an event is extremely unlikely (e.g., rolling a dice twice and getting a six both times)" -- how should a decision maker interpret this statement in terms of the probability of two sixes being rolled on the next two rolls of the die
==============

This is the type of statement that fuzzy controllers must deal with. I know of someone who is doing this for the control of wind turbines. Sensors are placed upwind to determine wind conditions that will occur at the blades a short time in the future and the blades are set in anticipation of these events. Significantly improved efficiency can result. So if there is high confidence that a wind gust is unlikely then the blades can be set in a state that would be hazardous in a wind gust. These are set as fuzzy policies or as condition (with truth value) /action pairs. Fuzzy controllers like this have been used in commercial products for about 20 years to my knowledge.

The primary difficulty with this is not in an interpretation but in policy conflict. If multiple fuzzy conditions are active then the consequences of their suggested actions must be assessed. Some actions may be mutually exclusive and so the controller or decision maker must decide which if any of the action recommendations should be taken.

So, for me, decision makers cannot take the IPCC statements as complete. They are simply conditions which must be tied to actions. The consequences of these actions must be assessed and any incompatibility between these actions must be accommodated.

This all assumes of course that the decision makers will take the IPCC assessments at face value. Some fuzzy controllers may be supplied with learning algorithms to determine how much salience to give to any sensor's prediction. So if a sensor (like the IPCC in certain cases) makes invalid predictions then the truth value that the IPCC places in its predictions can be adjusted before use by the decision maker.

dljvjbsl said...

Extremely unlikely - by IPCC definition a range of 5%

Extremely unlikely qualified with a truth value of "high confidence" - calculated from the IPCC definitions as a range of 5.76%

5% and 5.76% seem to be reasonably equal. The wide range seems to come from the definition of 0 to 5%

Jonathan said...

-5-Roger, your numbers are simply the square of mine; beyond that I make no comment on your mathematics.

I haven't read the other threads carefully and so may have missed some previous discussion. My estimates were off the top of my head, not carefully thought out.

Paul Baer said...

Tomorrow is the first day of my stats class. I'll try to run some of these experiments and report back. (I'll even video them if I can.)

William M. Briggs said...

1. The question/statement is ill posed and has no unique solution.

If there is a 0.8 chance the event has 0.05 probability, there is a 0.2 chance that it will take other values not equal to 0.05. Which values? We are not told.

Since all probability is conditional, we have to supply the information that will allow us to calculate the probability of the event.

The statement does not allow us to infer which information is *the* correct information. But if we suppose that correct information is 0.1 chance of 0.01 probability and 0.1 chance of 0.1 probability, then the probability of the event is

Pr(Event | this information ) = 0.1 * 0.01 + 0.8 * 0.05 + 0.1 * 0.1 = 0.051.

Again, this is a unique solution to "this information" (which I pulled out of the hole). Change the information, change the probability.
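The "change the information, change the probability" point can be made concrete. The sketch below reproduces the 0.051 figure from the supplied scenario weights and then swaps in a second, purely hypothetical scenario (the `alternative` list is an invented example, not from the comment) to show the answer moving.

```python
def mixture_probability(scenarios):
    """Probability of the event given (weight, probability) scenario pairs."""
    total_weight = sum(weight for weight, _ in scenarios)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("scenario weights must sum to 1")
    return sum(weight * prob for weight, prob in scenarios)


# The information supplied in the comment above.
briggs = [(0.1, 0.01), (0.8, 0.05), (0.1, 0.10)]

# Hypothetical alternative: the leftover 0.2 all sits at probability 0.5.
alternative = [(0.8, 0.05), (0.2, 0.50)]

print(round(mixture_probability(briggs), 3))       # 0.051
print(round(mixture_probability(alternative), 3))  # 0.14
```

Both scenarios honour the "0.8 chance of 0.05" reading, yet they give different answers, which is why the original statement has no unique solution.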

2. Same solution. The statement is ill posed and does not possess a unique interpretation.

This is good for the IPCC, since whatever happens they can claim afterwards to have meant what was most in their favor.

Roger Pielke, Jr. said...

-8-Jonathan

Thanks, squared due to 2 rolls

bernie said...

Shouldn't we either define the event as 2 sixes or as an unlikely event with a probability of between 0% and 5%?

In either case, why squared since the rare event is by definition singular or becomes singular since "2 sixes" are treated as a single event?

Finally, I am unclear how one can meaningfully quantify the range of probabilities of the "not" extremely unlikely event that might occur in the absence of an extremely unlikely event.

Given the above and the essential vagueness of the verbal definitions, I find it easier to stick to the dice example. If the event is 2 sixes then the "fair" dice probability of 2 sixes is (1/6)*(1/6) or ~2.8%. However, the IPCC is only 80% confident that the dice are "fair" - otherwise why wouldn't they be 100% confident? That means that they are saying that there is the possibility that the dice are fixed to roll nothing but sixes, which gives us a further 20% chance of rolling two sixes. But there is also the chance that the dice are fixed never to roll sixes. So we have a range of probabilities according to the IPCC of rolling 2 sixes of between 2.2% and 22.2%. Of course if there is a meaningful possibility that dice are fixed, why would anyone listen to the IPCC?
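The 2.2% and 22.2% figures follow from weighting the fair-dice probability by 0.8 and letting the remaining 0.2 go to one of two rigged extremes. A minimal sketch of that arithmetic, with illustrative variable names:

```python
fair_double_six = (1 / 6) * (1 / 6)  # about 2.8% for fair dice

confidence_fair = 0.8
confidence_rigged = 1.0 - confidence_fair

# Rigged never to roll a six: the leftover 20% contributes nothing.
lower = confidence_fair * fair_double_six + confidence_rigged * 0.0
# Rigged to roll only sixes: the leftover 20% contributes in full.
upper = confidence_fair * fair_double_six + confidence_rigged * 1.0

print(f"fair: {fair_double_six:.1%}")
print(f"range: {lower:.1%} to {upper:.1%}")
```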

bernie said...

Matt: On re-reading the IPCC statement, I think you are right. IPCC is covering all its bases. Like the Soviets, they never make mistakes.

Jonathan: I reread your comment and I essentially agree with your calculation though I still think that we have to add information so as to make the problems solvable - see Matt's comment.

UAN said...

The dice premise is a joke and full of hubris. Essentially before calculating any of the probabilities, the premise assumes we know there are only two dice, each with 6 sides and each with a value of 1 to 6, so any roll of the dice will result in a value between 2 and 12.

The hubris comes in stating that we know all conditions/variables and they are contained in the two dice and constrained to 1-6 for each one. So we can go on happily patting ourselves on the back and arguing semantics and statistics as long as the numbers come up between 2 and 12.

But we are still talking about science and not probability and statistics. We can roll between 2-12 a million times over, but the first time we roll a 1 or a 13 or any number outside 2-12, then all those other million rolls of the dice are meaningless and wrong. Ask Sir Isaac Newton about how definitive his theory of gravity is today. (not saying it's not still applicable or useful in many day-to-day instances)

We are limited by our ability to observe, and with the climate system so incredibly complex, to think that our current observations allow us to correctly predict with any degree of certainty is beyond belief. The premise itself indicates a degree of certainty that science shouldn't have in itself.

By turning the question of consequences into probabilities that are never "wrong" (as if we are the house in a casino where the odds favor us over time by a 1 or 2%, but that is enough to make us billions of dollars, even if some individual goes on a run and walks out a winner), then the science is never falsifiable -- or the consequence is moved so far out into the future, that we can never be proven wrong or the story is changed and so too the goal post. This is what politicians do. Not science.

Folks compare the consensus of AGW to the Theory of Evolution. Fair enough. But just because we understand evolution and know about DNA, etc., that won't do much good if you get certain types of cancers or something like ALS. Here's a list of diseases we don't have cures for yet, despite Evolution being "settled" (not to mention how much we don't know about the human body or how the brain works, despite much more work and time having been spent on it than AGW):

jgdes said...

My goodness! All this babbling on about probabilistic statistics when for the most part we are talking about statements made based on a show of hands, and sometimes only one guy just plucking a number from the aether.

jgdes said...

Neither a handful of opinions, nor a carefully filtered handful of computer model outputs can be treated by probabilistic stats. Now that is really basic! This discussion is only relevant for any conclusions that were reached from analysis of the raw data; ie practically nothing in any of the IPCC tomes. The tacit assumption by Annan that there are any pdfs behind these IPCC statements is just wrong. He continually does this; writing papers and critiques entirely on the back of a fundamentally wrong assumption.

It is this simple: If your opinion, whether based on a model or not and whether you placed a guessed percentage on it or not, is that something will happen but it doesn't happen then you are just plain wrong. No stats can be brought into it unless stats was involved in reaching the opinion in the first place.

Roger Pielke, Jr. said...

Blogger ate a comment from Jonathan, which he sent in by email:

"-11-Roger, my calculations are simpler than yours. My argument is as follows.

There is some parameter p, bounded by the range 0-1; for the moment we have no idea what sort of object p is beyond these facts. The IPCC have some underlying PDF on this parameter, which they have summarised by saying P(p less than or equal to 0.05)=0.8. You ask for my best estimate of p.

I want to answer this by evaluating the expectation value [p] by integrating over P. I don't know P, but I do know enough about it to determine my limits above.

Your argument seems to be that I should do something different because p is (in fact) itself a probability. I don't see why that's relevant."

Mark said...

Do the climate scientists not usually give a range to a specified level of confidence? That is they might predict 1.5° to 2.5° warming within 20 years to 95% confidence.

In such cases the 95% limit should be more or less two standard deviations from the middle and the 99% limit about three.

Any deviation from the predicted range by more than another standard deviation could reasonably be assumed to be calculation error, not random fluctuations.

(Almost none of the distributions are normal, but it works as a rule of thumb. If your estimation is five s.d. off then you have stuffed up, regardless of the shape of the distribution.)
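For reference, the rule of thumb can be compared with exact normal-distribution figures using Python's standard library. This sketch assumes a normal distribution, which the comment itself notes is rarely the case; the helper name `half_width_in_sd` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def half_width_in_sd(coverage: float) -> float:
    """Standard deviations either side of the mean that contain
    `coverage` of a normal distribution."""
    return std_normal.inv_cdf(0.5 + coverage / 2)


for coverage in (0.95, 0.99):
    print(f"{coverage:.0%} interval: +/- {half_width_in_sd(coverage):.2f} s.d.")

# Coverage of a three-standard-deviation interval.
print(f"+/- 3 s.d. covers {2 * std_normal.cdf(3) - 1:.2%}")
```

Under normality the 95% limit sits at about 1.96 standard deviations and the 99% limit at about 2.58, so "about three" corresponds more closely to 99.7% coverage.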

Jonathan said...

-17-Roger, thanks.

Note that the PDFs that lead to my extreme values are themselves extreme, being a pair of delta functions (spikes) of area 0.8 and 0.2 in each case (for the low end the spikes are at 0 and 0.05+, for the high end at 0.05- and 1). As such they are unrealistic, but they do provide bounds.

Experience has taught me that scientists almost always underestimate the true errors in any situation, so treating the errors they give you in the most pessimistic way possible provides a natural (if slightly arbitrary) correction for this.

-15-jgdes, I make no comment on how the IPCC may or may not have produced the underlying PDFs; I simply take the statements from Roger at face value.

I am disappearing from the web for a few weeks in an hour or two and so regret that I will be unable to continue this discussion.

Dikran Marsupial said...

Roger, your reasoning does not appear to be correct. Here is a simple Bayesian interpretation which demonstrates the error:

Let x be the probability of rolling two sixes in succession. Then the IPCC statement is equivalent to saying that the probability that x < 0.05 is 0.8; and hence the probability that x > 0.05 is 0.2.

If this is the only information we have, then the least informative p.d.f., p(x), would be flat with a value of 16 for 0 < x < 0.05 and 0.2105 for 0.05 < x < 1 (once properly normalised).

Thus the minimum possibility of two sixes is 0, as p(0) = 16 (this corresponds to the possibility that it is not a fair dice, which was not specified in the question). The maximum possibility of two sixes is 1, as p(1) = 0.2105 (perhaps every face of the dice has a six on it).

Note the IPCC statement does not specify a value for x, but a p.d.f. for x. This is because our knowledge of x is uncertain, so the proper thing to do is specify a distribution defining the relative plausibilities of different values of x. In this case, the statement doesn't rule out any value of x, so you are not correct in saying that there is a maximum or minimum probability of two sixes (other than 0 or 1).
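The density heights quoted in this comment (16 and 0.2105) can be reproduced by spreading each block of probability mass evenly over its interval. The sketch below assumes exactly that piecewise-uniform form; the function names are hypothetical.

```python
def piecewise_uniform_heights(breakpoints, masses):
    """Density height on each interval when each mass is spread evenly."""
    widths = [hi - lo for lo, hi in zip(breakpoints, breakpoints[1:])]
    return [mass / width for mass, width in zip(masses, widths)]


def piecewise_uniform_mean(breakpoints, masses):
    """Expected value of x: each mass times the midpoint of its interval."""
    midpoints = [(lo + hi) / 2 for lo, hi in zip(breakpoints, breakpoints[1:])]
    return sum(mass * mid for mass, mid in zip(masses, midpoints))


breakpoints = [0.0, 0.05, 1.0]  # x < 0.05 versus x > 0.05
masses = [0.8, 0.2]             # "high confidence" and the remainder

heights = piecewise_uniform_heights(breakpoints, masses)
mean_x = piecewise_uniform_mean(breakpoints, masses)

print([round(h, 4) for h in heights])  # [16.0, 0.2105]
print(round(mean_x, 4))                # 0.125
```

The density is positive everywhere on 0 to 1, so no value of x is ruled out, while the expected value of x under this density is 0.125.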

This is what a competent rational policymaker actually wants, because he can then use statistical decision theory for working out the best course of action. Say he has two possible policies A and B, and for each there is a loss function, l_A(x) and l_B(x), which describe the cost-benefit of following each policy, for a given value of the unknown x. The expected losses, E_A and E_B, can then be obtained by integrating the product of p(x) and l_A(x) over x=0 to x=1, and likewise for l_B(x). The rational course of action is then to choose the policy with the lowest expected loss.
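A minimal sketch of the expected-loss comparison described here, using the piecewise-uniform p(x) given earlier in this comment. The two loss functions are entirely hypothetical, invented only to make the integration concrete, and the integral is approximated with a simple midpoint rule.

```python
def density(x):
    """Piecewise-uniform p(x): mass 0.8 below 0.05, mass 0.2 above it."""
    return 0.8 / 0.05 if x < 0.05 else 0.2 / 0.95


def expected_loss(loss, n=100_000):
    """Midpoint-rule approximation of the integral of p(x) * loss(x) on [0, 1]."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        total += density(x) * loss(x)
    return total * step


# Hypothetical policy A: do nothing, so the loss grows steeply with x.
def loss_a(x):
    return 100.0 * x


# Hypothetical policy B: pay a fixed cost of 5 to reduce exposure to x.
def loss_b(x):
    return 5.0 + 20.0 * x


e_a = expected_loss(loss_a)
e_b = expected_loss(loss_b)
print(round(e_a, 3), round(e_b, 3))  # 12.5 7.5
print("choose", "A" if e_a < e_b else "B")
```

With these invented losses policy B has the lower expected loss; different loss functions, or a different p(x), could reverse the choice.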

It seems to me that your error stems from not appreciating the purpose of uncertainty, and hence misinterpreting the (rather clear) statement.

Similarly the second sentence defines a different p.d.f for x, the probability of getting two heads. In this case, plugging in the numeric values you suggest, "high confidence that an event is about as likely as not", translates to

The probability that x lies in the range 0.33 to 0.66 is 0.5. So again, if that is all we know, then we should adopt uniform distributions between change points, the p.d.f. is

p(x) = 0.25/0.33 = 0.7576 for x < 0.33
p(x) = 0.5/(0.66-0.33) = 1.5152 for 0.33 <= x <= 0.66
p(x) = 0.25/(1-0.66) = 0.7353 for 0.66 < x < 1

remember this is a normalised p.d.f., not the probabilities of particular events.
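The three heights above can be recomputed directly. The comment assigns a mass of 0.5 to the central range; since definition A in the post gives "high confidence" as about 8 in 10, the sketch also shows, purely as an assumption for comparison, what the heights would be with a central mass of 0.8. The function name is hypothetical.

```python
def piecewise_uniform_heights(breakpoints, masses):
    """Density height on each interval when each mass is spread evenly."""
    widths = [hi - lo for lo, hi in zip(breakpoints, breakpoints[1:])]
    return [mass / width for mass, width in zip(masses, widths)]


breakpoints = [0.0, 0.33, 0.66, 1.0]

# Masses as stated in the comment: 0.5 in the middle, 0.25 in each tail.
as_stated = piecewise_uniform_heights(breakpoints, [0.25, 0.5, 0.25])
print([round(h, 4) for h in as_stated])  # [0.7576, 1.5152, 0.7353]

# Assumption for comparison: central mass 0.8, as in definition A of the post.
with_high_confidence = piecewise_uniform_heights(breakpoints, [0.1, 0.8, 0.1])
print([round(h, 4) for h in with_high_confidence])
```

Either way the heights are densities rather than probabilities, which is why values above 1 are not a problem.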

Again the competent policy maker can propagate the effects of this uncertainty through his loss model to compute the expected losses of various policies and make a rational choice.

The statement provided by the IPCC, when properly interpreted, gives exactly the information that the rational policy maker needs. Its use for falsification of the science is much less straightforward AS THAT IS NOT THE PURPOSE OF THE STATEMENT OF UNCERTAINTY. If you want to know what conditions are required for the science behind a projection to be falsified, that is a different question, but one that can be directly answered.

Roger Pielke, Jr. said...

-19-Dikran Marsupial

Thanks much ... a quick reply, as Briggs comments above, there is no basis to assume one PDF over another. In fact the IPCC warns against assuming a uniform distribution: "Note that in some cases the nature of the constraints on a value, or other information available, may indicate an asymmetric distribution of the uncertainty range around a best estimate."

While the notion of a theoretically "rational decision maker" (particularly in the context of uncertainty, ignorance, and different "loss functions" among stakeholders with different power, resources, interests) is worth a discussion, I'll put it off until another time.

Thanks!

Dikran Marsupial said...

"how should a decision maker interpret this statement in terms of the probability of two sixes being rolled on the next two rolls of the die?"

In short, he/she should interpret it as a (heavily discretised) specification of a p.d.f. for the probability of two sixes. There are a variety of things he/she could use this p.d.f. to calculate depending on what it was they were trying to do.

There isn't enough information in the statement though to falsify the science on which it was based; for that you need a projection to specifically rule something out. If you were to ask the scientists I'm sure they would be able to refine the p.d.f. so that p(x) = 0 for some value of x, it is just that was not the purpose of the original statement, and it is an error to treat it as such.

Dikran Marsupial said...

Roger, you are not correct to say that there is no basis to assume one p.d.f. over another, there is MAXENT, for example, which suggests you should adopt the least informative p.d.f. (i.e. the one that adds as little information as possible to the information contained in the question). In this case, the p.d.f. I suggested is essentially the one MAXENT would give. You can adopt another one of course, providing you are willing to justify the additional information that your assumptions involve.

The IPCC caveat says "in some cases". Your objection is therefore invalid unless you can say why there are constraints on a value (in this case probabilities can go from 0 to 1 - so no problem there); or other information available (none was given - if it was given I wouldn't have used a uniform distribution), or may be asymmetric (nothing in the question implied any asymmetry). Sure we could use some other distribution, but that would involve some additional information that was not contained in the question. Note (assuming fair dice); whether you get two sixes in two rolls of a dice is a Bernoulli trial. The uninformative prior for the probability of a Bernoulli trial is a uniform distribution (usually represented as a B(1,1)), which suggests its usage here is pretty reasonable.

The notion of a rational decision maker is key to correctly interpreting probabilistic uncertainty. This is exactly what statistical decision theory is for, and any discussion that is not based on statistical decision theory is at best sub-optimal (if not actually irrational).

Roger Pielke, Jr. said...

-22-Dikran Marsupial

Thanks ... a decision maker who is risk averse versus risk taking will be comfortable departing from a uniform PDF. As you write, "You can adopt another one of course, providing you are willing to justify the additional information that your assumptions involve."

Thus we agree, as this is my point as well. Not knowing the decision context, there is no basis for you or me to assume one PDF over another. Context matters.

You write: "The notion of a rational decision maker is key to correctly interpreting probabilistic uncertainty."

Well in that case we might as well give up;-) Decision makers are rational in some versions of economic theory and perhaps in textbook games involving probability. In the real world they are sometimes "boundedly rational" (a la Simon) and typically subject to heuristics and biases (a la Tversky and Kahneman). More typically decisions are not based on such textbook reasoning based on either frequentist or Bayesian statistics but "expectations" clouded by "uncertainty" that is defined far more richly than conventional decision analysis would have it (e.g., a la Keynes, Knight, Shackle).

It is always interesting to see followers of a particular intellectual paradigm pronounce their views to be "correct" and any other views to be "wrong" when there is at least a century of intelligent discussion about such questions, with multiple worthwhile but inconsistent (with each other) paradigms being advanced.

Thanks

William M. Briggs said...

Roger, I had a fuller go at this over at my place. Those interested in the math (only) might want to see:

Logical Probability And The IPCC's Ambiguous Forecasts http://wmbriggs.com/blog/?p=4255

Dikran Marsupial said...

Roger, whether a decision maker is risk averse shows up in his loss functions l_A(x) and l_B(x), not in the p.d.f. p(x).

In the question as posed, there is no context. In the IPCC report however there is plenty of context (i.e. the science) that might suggest a different prior. However if you don't supply that additional information it is unfair to criticise me for not using it!

The point is that there are principles for choosing p.d.f.s for ill-posed problems, the principal methods being MAXENT and transformation groups.

Having mentioned statistical decision theory, it should have been obvious what was meant by a rational decision maker (i.e. someone that always chooses to minimise their expected loss). I personally think that is a reasonable definition of rational. Of course the real world clearly isn't rational, but one would hope that real world politicians would want to ask their scientists and economists to advise them on which policies were most likely to minimise the expected loss (of course they may not follow their advice) - or am I expecting too much of them?

O.K. so your analysis isn't wrong, it just makes informative assumptions about the p.d.f. of x that remain unsaid and unjustified. I have justified mine - MAXENT; what is the justification for yours?

It appears to me that you actually use two p.d.f.s, one to compute the "maximum probability" with point masses at 0.05 and 1.0 and a different one to compute the "minimum probability" with point masses at 0.0 and 0.05. The fact you have two p.d.f.s for the same thing should be a warning sign. Note also that your computations are computing expectations of some variable, so it is very misleading to describe them as minimum and maximum anything.

Roger Pielke, Jr. said...

-26-Dikran Marsupial

Thanks ... Matt Briggs gives the full answer that I would to your question on the math (same as #3 and #6 above):

http://wmbriggs.com/blog/?p=4255

This answers the question: if you vary the PDF as much as possible, what are the bounds in the resulting probabilities?

Dikran Marsupial said...

"if you vary the PDF as much as possible, what are the bounds in the resulting probabilities?"

Roger, that answer clearly isn't correct as the least informative p.d.f. has maximum and minimum of 1 and 0, so how can your figures be bounds? Hint, you can't compute bounds by computing expectations.

The original p.d.f.s that you get by varying the p.d.f.s as much as possible also have maxima and minima of 0 and 1.

Roger Pielke, Jr. said...

-28-DM

What do you disagree with in this nice summary from Briggs:

". . . though we cannot come to an exact solution, we can find its bounds given the language we do have. First, we know there is a 0.8 chance that (Prob < 0.05): the lowest this can be is 0 (just in case (Prob < 0.05) means 100% certainty of 0), and the highest it can be is 0.05 (just in case (Prob < 0.05) means 100% certainty of 0.05). Thus 0.8 * (Prob < 0.05) is between 0 and 0.04.

Now the 0.2 chance. The probabilities available to us are those between 0.05 and 1 (or so it seems; the language is still ambiguous). This means 0.2 times whatever this is is bounded between 0.01 and 0.2.

Our solution is then

Pr(A | our information) in [0.01, 0.24]."

Dikran Marsupial said...

To add, I am trying to work out what it actually is you have computed. Rather than a bound, it is clearly an expectation, so it should be "expected" something. Are you trying to obtain something like a credible interval on x (the probability of observing two sixes)?

Roger Pielke, Jr. said...

-30- DM

Sure, that sounds fine ... above I used the term "potential range."

When people are communicating across different disciplines, it is always good to just assume that particular terms of art will not have a shared understanding.

Dikran Marsupial said...

Pr(A|our information) in [0.01, 0.24] is clearly in error, as for example we are not told that it is a fair dice. If all six faces have a six on them then the probability is actually 1. Note that the p.d.f. I gave doesn't rule that out, your analysis does, hence you are using information not given in the question.

However I think the main problem is that you have computed the expectation of a bound rather than a bound on the expectation. They are not the same thing.

I think part of the problem lies in your description of what you have computed, it clearly isn't a minimum or maximum probability. As the last operation in your analysis is computing an expectation (probability weighted sum) try describing what you have computed as being the "expected .... ". If you can explain what it is the expected value of, it might be easier to see where the communication problem lies.

Dikran Marsupial said...

Roger@31. What is the exact definition of a "potential range" (I trust you are familiar with the standard definition of a Bayesian credible interval)?

Dikran Marsupial said...

I think I may be getting there; I think the problem may have something to do with having a p.d.f. for the value of a probability. Talking about probabilities of probabilities often causes difficulties as it is easy to mistake the probability of a probability for the probability itself. Which I think may have happened here.

In computing the "maximum probability", you have taken a "worst case" p.d.f. with point masses at 0.05 and 1.0, and then computed the expectation. But this is the "maximum EXPECTED probability" after having marginalised over the uncertainty expressed by the p.d.f., not the probability itself.

Likewise for the "minimum probability" you have taken a "best case" p.d.f. with point masses at 0.05 and 0, and computed the expectation. But again it is the "minimum EXPECTED probability", not the probability itself.

Thus you have formed bounds on the expectation of the p.d.f., but that is not the same thing as a bound on the actual probability itself (which as I have pointed out can be anywhere from 0 to 1).

To demonstrate that the probability can be anywhere from 0 to 1, is there anything in the original statement that definitively rules out any particular value of x? No (hence comments about unfalsifiability), so the range can't be any narrower than [0,1].
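The distinction can be checked with a few lines of Python. This is only a sketch: it assumes the statement is read as a 0.8 probability that x < 0.05 and a 0.2 probability that x >= 0.05, pushes each block of probability mass to the edge of its region, and uses a made-up helper name (`expected_x`). The point mass at 0.05 in the "worst case" is a limiting value, since the statement says strictly less than 0.05.

```python
# Sketch: bounds on the EXPECTED value of x versus the range of x itself.
# x is the (unknown) probability of rolling two sixes.
# Assumed reading of the statement: P(x < 0.05) = 0.8, P(x >= 0.05) = 0.2.

def expected_x(point_masses):
    """Expectation of x under a p.d.f. made of (value, weight) point masses."""
    total_weight = sum(w for _, w in point_masses)
    assert abs(total_weight - 1.0) < 1e-12, "weights must sum to 1"
    return sum(v * w for v, w in point_masses)

# "Worst case": all mass pushed as high as each region allows.
worst_case = [(0.05, 0.8), (1.0, 0.2)]
# "Best case": all mass pushed as low as each region allows.
best_case = [(0.0, 0.8), (0.05, 0.2)]

max_expected = expected_x(worst_case)   # 0.8*0.05 + 0.2*1.00
min_expected = expected_x(best_case)    # 0.8*0.00 + 0.2*0.05

# Extreme values of x that still carry weight in some admissible p.d.f.
lowest_possible_x = min(v for v, _ in best_case)
highest_possible_x = max(v for v, _ in worst_case)

print(f"expectation of x lies in [{min_expected:.2f}, {max_expected:.2f}]")
print(f"x itself can lie in      [{lowest_possible_x:.2f}, {highest_possible_x:.2f}]")
```

The first interval reproduces the 0.01 and 0.24 figures; the second shows that the same two p.d.f.s still put weight on x = 0 and x = 1.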

Roger Pielke, Jr. said...

-32-DM

How can the probability be 1.0 if there is an 80% chance that the probability of 2 6s is < 0.05?

Dikran Marsupial said...

Roger@34 If there is an 80% chance that the probability of 2 sixes is < 0.05, then there is a 20% probability that it is >= 0.05, hence it is possible that the probability is 1.

Roger Pielke, Jr. said...

-35-DM

Apologies, I don't understand your point. Are you trying to make a point about the objective uncertainty?

If there is a 20% chance that the odds of 2 6s is > 0.05, then while 1.0 is in that range, there is no way that your expectation will be 1.0 based on this information. That is the calculus that you find at Briggs.

But perhaps I misunderstand your point, if so, sorry.

Dikran Marsupial said...

Roger @ 36 There seems to be at least one of my posts in your moderation queue that hasn't shown up yet, that is highly relevant to this (and posted before my post @ 35). It may be easier for the readers if that post were to appear.

I note you have inserted the word "expectation" that I didn't use in my post @ 35, nor in your post @34.

The probability of observing 2 sixes is not the same thing as the expected value of the probability (which can be viewed as a point estimate of the probability after marginalising over the uncertainty expressed by a p.d.f.).

This sort of difficulty abounds in discussions that involve probabilities of probabilities (in this case a p.d.f. describing plausible values). It is all too easy to confuse the probability (or an expectation) of a probability for the probability itself. I said the probability x may be 1, I did not say that the expectation of this probability under my (or any) p.d.f. was one.

The plausible values of the probability of 2 sixes go all the way from 0 to 1, given the information in the original statement. The original statement tells you that some of these values are more plausible than others, but it doesn't rule any of them out.

So the question is, what is the value of bounds on the expectation of the probability? I would argue that it doesn't tell you very much about the relative plausibility of different values of the probability itself. It is a bit like the standard error of the mean, which gives bounds on plausible values of the mean; the variability of the population itself, however, is given by the standard deviation, not the standard error of the mean.

bernie said...

Dikran:
You ask: Are you trying to obtain something like a credible interval on x (the probability of observing two sixes)?

I think it is more like, "Are you trying to obtain something like a credible interval on the probability of observing two sixes when you are not certain that the dice are fair?" or to use the event x, "Are you trying to obtain something like a credible interval on x, when you are not certain that your model for predicting x is complete or accurate?"

The entire exercise I think goes to show that if you cannot predict occurrence of a currently unlikely event (e.g., average temperatures > X, where X is significantly greater than current average temperatures, or the frequency of Level 5 strength hurricanes > Y, where Y is significantly greater than the current frequency of Level 5 hurricanes) with a high degree of certainty then your prediction will look pretty lame.

Roger Pielke, Jr. said...

-38-DM

Yes, you are talking about the coin under Paul Baer's hand (from the earlier thread) ... that is not what is the issue here.

Dikran Marsupial said...

Roger @ 40. No, I am not talking about that issue. I am only talking about the interpretation of the statement of uncertainty; the true value of the probability of getting two sixes is entirely irrelevant. You have misunderstood my point (no problem, in the nature of scientific discussions).

Roger Pielke, Jr. said...

-41-Dikran

Thanks, and you are correct, then I don't understand your point. (Sorry!)

Maybe you can take the information in question #1 above and show me your math that results in 1.0 (?)

FYI, I'll be slow to respond today as it is the first day of classes here ... but not due to a lack of interest. Thanks.

Dikran Marsupial said...

bernie @ 39 The second definition you give is reasonable, i.e. we want a credible interval on the probability of observing two sixes given the uncertainties. You get that credible interval by determining the p.d.f. (as I did) and calculating the narrowest region that contains 95% of the probability mass.

If you do that you get an interval much wider than that calculated by Roger. That is because the expectation of a probability is not the probability itself. Note I am not talking about the true probability here, just the distribution of values of the true probability that are consistent with the statement.

As to predictions being lame, the prediction that "with high confidence (80%) it is extremely unlikely (<5%) that rolling a dice twice will give a six both times" is hardly lame for the simple reason that (assuming a fair dice) it is actually true (the probability is actually 0.0278). What is lame about that?
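A rough Python sketch of the credible interval described in this comment follows. The exact p.d.f. is an assumption: a piecewise-uniform density with mass 0.8 spread evenly over [0, 0.05) and mass 0.2 spread evenly over [0.05, 1]. The function name `narrowest_interval` is invented for the illustration.

```python
# Sketch: narrowest 95% credible interval for x under an ASSUMED
# piecewise-uniform p.d.f.: mass 0.8 spread evenly over [0, 0.05),
# mass 0.2 spread evenly over [0.05, 1].

LOW_MASS, HIGH_MASS = 0.8, 0.2
SPLIT = 0.05

low_density = LOW_MASS / SPLIT            # 16.0 on [0, 0.05)
high_density = HIGH_MASS / (1.0 - SPLIT)  # about 0.21 on [0.05, 1]

def narrowest_interval(coverage):
    """A narrowest interval holding `coverage` of the probability mass.

    The density never increases with x, so starting at 0 and extending
    to the right until enough mass is covered gives a narrowest interval.
    """
    if coverage <= LOW_MASS:
        return 0.0, coverage / low_density
    extra = coverage - LOW_MASS
    return 0.0, SPLIT + extra / high_density

lower, upper = narrowest_interval(0.95)

# Expected value of x under the same assumed p.d.f. (midpoint of each
# uniform piece, weighted by its mass).
mean_x = LOW_MASS * (SPLIT / 2) + HIGH_MASS * ((SPLIT + 1.0) / 2)

# Probability of two sixes with a fair die, for comparison.
fair_two_sixes = (1 / 6) ** 2

print(f"95% credible interval for x: [{lower:.4f}, {upper:.4f}]")
print(f"expected value of x:         {mean_x:.4f}")
print(f"fair-die probability:        {fair_two_sixes:.4f}")
```

Under these assumptions the interval runs from 0 to about 0.76, far wider than the range of the expectation, while the fair-die value of 0.0278 sits comfortably below 0.05.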

Dikran Marsupial said...

No problem Roger, as I said talking about probability distributions on probabilities is often rather difficult.

It is probably easier to start with intuition to see what we can agree on. Does the original statement rule out any value of x (the probability of rolling two sixes) as being implausible?

Roger Pielke, Jr. said...

-44-Dikran

Sorry, I don't understand what you are asking.

The original statement says that the IPCC is 80% certain that there is < 5% of rolling a die 2 times and getting 6s both times.

I suppose I could ignore the IPCC statement completely and posit instead that the probability of rolling two sixes is 1.0. My rejection of the statement is not precluded by the statement.

But I fail to see how you get to a p = 1.0 from the IPCC statement.

Dikran Marsupial said...

Roger. I have already used p to represent the p.d.f. of the plausible values for x (where x is the probability of getting 2 sixes), so I have reverted back to the previous notation. I am happy to change notation if it helps, but it needs to be done in one go rather than piecemeal.

I did not say that the value of x is 1. What I am saying is that the maximum value of x that is consistent with the IPCC statement is 1.

This is why I asked if the original statement ruled out any value of x as being implausible (i.e. some x for which p(x)=0). If no such value exists then clearly all values of x from 0 to 1 are consistent with the statement.

I certainly don't want you to ignore the statement, just say what values of x are consistent with it.

Roger Pielke, Jr. said...

-46-Dikran

If the die has 6s on all six sides (p(x) = 1.0), then I would say that this makes the IPCC statement wrong.

In other words, the IPCC is saying that it believes probability of rolling two sixes to be between 0.01 and 0.24. If the probability falls outside this range, then this would be inconsistent with the IPCC statement.

Dikran Marsupial said...

Roger, I did not ask whether the IPCC statement was right or wrong (which is entirely irrelevant to the determination of what it actually means). I asked if the IPCC statement ruled out the possibility that x = 1.

It clearly doesn't. It says that there is a 0.8 probability that x < 0.05. Therefore there is logically a 0.2 probability that x >= 0.05. That is *all* it actually says. Thus if there is a 0.2 probability that x >= 0.05, that does not rule out the possibility that x = 1. It might be extremely unlikely, but it is not ruled out by the IPCC statement.

Note p(x) is a normalised p.d.f. not a probability, so p(x)=1 just says that all values of x are equally probable.

Dikran Marsupial said...

Roger "In other words, the IPCC is saying that it believes probability of rolling two sixes to be between 0.01 and 0.24."

No, that is your interpretation of what the IPCC is saying. However I have already pointed out that the 0.01 and 0.24 figures are not bounds on the probability of rolling two sixes, but bounds on the *expectation* of the probability of rolling two sixes. These are not the same thing, and substituting the expected value of a probability for the probability itself seems to be the basis of your misunderstanding.

bernie said...

Dikran:
The type of prediction is lame about the 2 sixes because the IPCC seldom provides an explanation for 80% confidence. Given the dice example, the only reason that it can be 80% and not 100% is that the dice may not be fair. This obviously begs an important issue. Similarly, IMHO, when a group such as the IPCC attaches its own uncertainties to its own assessments, there is a need for considerable explanation - otherwise it is very difficult to understand what they actually mean. If somebody came to me with such a statement I would push very hard to understand (a) the source of their uncertainties and (b) why they are presenting findings in which they do not have "very high confidence".
I am not sure whether you are familiar with Roger's original Climate Change paper (Jonassen and Pielke (2011)), which essentially quantifies and supports an earlier assessment of the IPCC AR4 findings by the InterAcademy Council (2010). I would be interested in hearing your take on the paper and the IPCC's approach to helping policy makers determine how to assess their findings.

Roger Pielke, Jr. said...

-48, 49-Dikran

OK, I'm playing along ... the set of things "not ruled out" by the IPCC statement is infinite.

Of course, the 0.01 to 0.24 on the expectation of the probability is my interpretation of what the IPCC is saying. What do you interpret the IPCC to be saying?

Remember, I am interested in evaluating the expectation.

Dikran Marsupial said...

Roger@51

"OK, I'm playing along ... the set of things "not ruled out" by the IPCC statement is infinite."

Sorry, but that isn't playing along as the wording is ambiguous. Answering the following question with an answer of "yes" or "no" would be more helpful.

Does the IPCC rule out the possibility that the probability of observing two sixes being one, i.e. x = 1?

"Remember, I am interested in evaluating the expectation."

In that case it is hardly surprising there has been considerable confusion as you didn't actually use the word "expectation" until post 24, which is well after your analysis was presented in post 5.

In post five you calculated the "Maximum probability of two sixes = (0.8*0.05 + 0.2*1.00)^2 = 5.76%". Note the word "expectation" does not appear.

Now, the next question is *why* are you interested in the expectation. The distribution implied by the statement is bound to be heavily skewed, and so an expectation does not give a good summary of the distribution.

Dikran Marsupial said...

bernie@50 "The type of prediction is lame about the 2 sixes because the IPCC seldom provides an explanaion for 80% confidence."

That simply isn't true, the published guidance sets it out explicitly, e.g. "Very High confidence" means at least 9 out of 10 chance of being correct. That ought to be perfectly understandable by anyone.

I haven't read Roger's paper yet, just finished reading Judith's. Will be reading Roger's properly soon (have only skimmed it so far).

Roger Pielke, Jr. said...

-52-DM

"Does the IPCC rule out the possibility that the probability of observing two sixes being one, i.e. x = 1?"

No

"Now, the next question is *why* are you interested in the expectation."

Because I am interested in evaluating the knowledge claims made by the IPCC, especially in the context of events that are predicted probabilistically but for which there will only be one realization.

(Sorry for confusion from my replies -- I thought this was obvious given the context of these posts. I think I now see where we crossed wires, my fault.)

See: http://rogerpielkejr.blogspot.com/2011/08/ink-blots-ambiguity-and-outcomes-in.html

Dikran Marsupial said...

Roger@54

I'm glad you agree then that the IPCC statement does not rule out the possibility that x=1. However this directly contradicts the statement "Maximum probability of two sixes = ... 5.76%". How can the maximum probability be 0.0576 (as you derive in post 5) when it is possible for x to be 1, which is larger?

"Because I am interested in evaluating the knowledge claims made by the IPCC, especially in the context of events that are predicted probabilistically but for which there will only be one realization."

Bounds on the expected value of the probability do not give the information you need to do that. For example, consider a problem with a standard Gaussian distribution, N(0,1). You are given a sample of data, from which you compute a mean (i.e. the expectation) of zero with a standard error of the mean of 0.1. This gives a 95% credible interval for the expected value (or mean) of +- 0.2. If we then draw a single realisation from N(0,1) and use this to validate our model, and get a value of 1, this is outside the bounds on the expected value. Does this mean our model was wrong? No, the model was completely correct. N.B. Douglass et al. (doi:10.1002/joc.1651) made essentially exactly that error.
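The Gaussian example is easy to simulate. The sketch below is an illustration only: the sample size of 100 is chosen so that the standard error comes out near 0.1, the seed is arbitrary, "95%" is approximated by two standard errors or deviations either side, and `fraction_outside` is a made-up helper.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the sketch is repeatable

# The "true" model in the example: a standard Gaussian, N(0, 1).
sample = [random.gauss(0.0, 1.0) for _ in range(100)]

mean = statistics.mean(sample)
sd = statistics.stdev(sample)
sem = sd / len(sample) ** 0.5   # standard error of the mean, roughly 0.1

# Roughly 95% bounds: two standard errors (or deviations) either side.
mean_bounds = (mean - 2 * sem, mean + 2 * sem)   # bounds on the expectation
draw_bounds = (mean - 2 * sd, mean + 2 * sd)     # bounds on single draws

# Many new single realisations from the same, correct, model.
draws = [random.gauss(0.0, 1.0) for _ in range(10_000)]

def fraction_outside(values, bounds):
    """Fraction of values falling outside the (low, high) bounds."""
    low, high = bounds
    return sum(1 for v in values if v < low or v > high) / len(values)

outside_mean_bounds = fraction_outside(draws, mean_bounds)
outside_draw_bounds = fraction_outside(draws, draw_bounds)

print(f"standard error of the mean: {sem:.3f}")
print(f"draws outside bounds on the mean: {outside_mean_bounds:.1%}")
print(f"draws outside bounds on a draw:   {outside_draw_bounds:.1%}")
```

Most single draws from the correct model fall outside the bounds on the mean, while only a small fraction fall outside the bounds built from the standard deviation.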

If you want to falsify the models, you need a projection that rules something out. You can't do it with a projection that says what is likely to happen.

You can however compute the likelihood statistic for the corpus of probabilistic predictions, which provides a means of evaluating the IPCC projections. You could work out whether it showed statistical skill by permutation testing (i.e. randomly permute the statements with randomly chosen targets to get the distribution of likelihoods assuming that the IPCC were just guessing).
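One way such a permutation test might look in Python is sketched below. Everything in it is hypothetical: the list of stated probabilities and outcomes is invented, shuffling the outcomes across statements is just one reading of "randomly permute the statements", and the seed and number of permutations are arbitrary.

```python
import math
import random

random.seed(0)  # arbitrary seed so the sketch is repeatable

# HYPOTHETICAL data: (stated probability that the event occurs, did it occur?)
predictions = [
    (0.95, True), (0.90, True), (0.80, True), (0.66, True), (0.66, True),
    (0.33, False), (0.20, False), (0.10, False), (0.05, False), (0.05, False),
]

def log_likelihood(probs, outcomes):
    """Log-likelihood of the outcomes under the stated probabilities."""
    return sum(math.log(p if o else 1.0 - p) for p, o in zip(probs, outcomes))

probs = [p for p, _ in predictions]
outcomes = [o for _, o in predictions]
observed = log_likelihood(probs, outcomes)

# Null hypothesis ("just guessing"): the pairing between statements and
# outcomes carries no information, so shuffle the outcomes and re-score.
n_permutations = 5000
permuted = []
for _ in range(n_permutations):
    shuffled = outcomes[:]
    random.shuffle(shuffled)
    permuted.append(log_likelihood(probs, shuffled))

# Share of shuffles scoring at least as well as the real pairing
# (the +1 terms count the observed pairing itself).
as_good = sum(1 for ll in permuted if ll >= observed)
p_value = (as_good + 1) / (n_permutations + 1)

print(f"observed log-likelihood: {observed:.3f}")
print(f"permutation p-value:     {p_value:.4f}")
```

A small p-value would suggest the stated probabilities carry real skill; with the invented data above the pairing is as good as it can be, so few shuffles match it.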

Roger Pielke, Jr. said...

-55-DM

You are correct, I should have written, "The IPCC statement implies a maximum expected probability of ..." Again, apologies for the blog shorthand.

On the evaluation question, if you tell me that your expectations are for a probability of 0.01 to 0.24 and I then show you a die with six 6s, then I am comfortable saying that your expectation was wrong.

As I wrote on the post I linked to above: "The typical mode of engagement with skeptics by many visible climate scientists is to argue how right they are (and wrong/evil the skeptics are) -- but what skeptics need instead is to hear what it would mean for climate scientists to be wrong." Which sounds a lot like your comment: "If you want to falsify the models, you need a projection that rules something out. You can't do it with a projection that says what is likely to happen."

I am less interested in falsifying or validating models (itself a questionable activity, see Oreskes et al. 1994) than I am in evaluating predictions/projections/expectations. Correct models can make bad predictions (i.e., you can lose money at a dice game even if you know the odds perfectly).

It gets more complicated when we go from textbook examples to the real world -- consider ratings agencies and subprime mortgage ratings, were the expectations in AAA rating wrong (even though expressed as likelihoods)? Of course they were.

IPCC projections have far more in common with ratings agency ratings than dice rolls, though I have come to realize that this position is not everywhere accepted ;-)

bernie said...

Dikran:
I understand in the abstract that the IPCC has defined the label "Very High confidence" as "at least 9 out of 10 chance of being correct", but do you really think that this is meaningful guidance? How can a policy maker possibly determine whether such confidence is reasonable or unreasonable? Can you point me to where somebody has used a similar approach?

Dikran Marsupial said...

bernie@57 "How can a policy maker possibly determine whether such confidence is reasonable or unreasonable?"

The way to determine whether such confidence is reasonable is to read the WG1 report, and if necessary follow the references and/or do a literature survey of your own. Basically the way to know if it is reasonable or not is to understand climatology. Now most policy makers will be unable to do this, which is why the IPCC was formed to write a review and summary of the important findings from climatological research.

"Can you point me to where somebody has used a similar approach?"

Science makes probabilistic statements of this sort on a very regular basis. Generally they are made in journal papers for the benefit of fellow scientists who understand such statements without any difficulty. The terminology was specified in a form that is more easily understood by policymakers, but the basic form of the statements is not at all unusual.

Whenever a scientist gives error bars on a prediction, he is making a statement that "it is highly likely that the observed value will lie within this range".

Dikran Marsupial said...

Roger@56 Leaving out the "expected" is not reasonable blog shorthand as it results in a complete change in meaning.

"On the evaluation question, if you tell me that your expectations are for a probability of 0.01 to 0.24 and I then show you a die with six 6s, then I am comfortable saying that your expectation was wrong."

Here you are confusing the statistical meaning of "expected" with the meaning of "expected" in normal English usage. This leads to an obvious error. Say I build my model of the results of rolling two dice and taking the sum by rolling two dice, say 100 times, and taking the sum. I just did exactly that (well, simulated it using MATLAB) and I get an expected value (i.e. the mean) of 7.36 with a standard deviation of 2.1905. This means the standard error of the mean is 0.21905, which gives a 95% confidence interval of [6.9219, 7.7981]. Now if I roll two dice and take the sum and get (say) 9, does that make my model wrong? No, of course not, because the standard error of the mean gives error bars representing the sampling uncertainty in estimating the expectation, not the error bars on how the actual observations are grouped around the expected value; to do that you need the standard deviation. Using the standard deviation, you get 95% error bars of [2.9790, 11.7410], and the observation is clearly consistent with that.
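The arithmetic can be re-derived in a few lines. The sketch below uses Python rather than MATLAB, takes the quoted mean and standard deviation as given, and assumes the intervals were formed as two standard errors (or deviations) either side of the mean, which is what reproduces the quoted figures. The fresh simulation at the end uses an arbitrary seed and will not match the quoted sample exactly.

```python
import math
import random
import statistics

# Part 1: re-derive the intervals from the figures quoted in the comment.
n_rolls = 100
quoted_mean = 7.36
quoted_sd = 2.1905

quoted_sem = quoted_sd / math.sqrt(n_rolls)   # standard error of the mean
ci_mean = (quoted_mean - 2 * quoted_sem, quoted_mean + 2 * quoted_sem)
ci_obs = (quoted_mean - 2 * quoted_sd, quoted_mean + 2 * quoted_sd)

# A single new observation, as in the comment.
observation = 9
inside_mean_ci = ci_mean[0] <= observation <= ci_mean[1]
inside_obs_ci = ci_obs[0] <= observation <= ci_obs[1]

print(f"standard error of the mean: {quoted_sem:.5f}")
print(f"interval for the mean:      [{ci_mean[0]:.4f}, {ci_mean[1]:.4f}]")
print(f"interval for observations:  [{ci_obs[0]:.4f}, {ci_obs[1]:.4f}]")
print(f"9 inside interval for the mean?     {inside_mean_ci}")
print(f"9 inside interval for observations? {inside_obs_ci}")

# Part 2: a fresh simulation of 100 sums of two fair dice.
random.seed(1)  # arbitrary seed so the sketch is repeatable
sums = [random.randint(1, 6) + random.randint(1, 6) for _ in range(n_rolls)]
sim_mean = statistics.mean(sums)
sim_sd = statistics.stdev(sums)
sim_sem = sim_sd / math.sqrt(n_rolls)
print(f"simulated mean {sim_mean:.2f}, s.d. {sim_sd:.4f}, s.e.m. {sim_sem:.5f}")
```

A roll of 9 falls outside the interval for the mean but well inside the interval for individual observations.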

"but what skeptics need instead is to hear what it would mean for climate scientists to be wrong."

Well why not ask/challenge them to make a projection that is *designed* to facilitate falsification?

"Correct models can make bad predictions"

Indeed the 28% figure you gave in an earlier blog article is the proportion of the IPCC statements that you would expect to be wrong ASSUMING THE IPCC's SCIENTIFIC POSITION WERE CORRECT. Sorry about the shouting, but that is quite an important point. The IPCC don't expect them all to come true; if they did they would not have used the terminology they used to quantify their uncertainty.

Roger Pielke, Jr. said...

-59-Dikran

Thanks again for the exchange ... a few replies.

1. "Here you are confusing the statistical meaning of "expected" with the meaning of "expected" in normal English usage."

Funny, I'd make the same comment right back to you as well;-) Question: Were the ratings agencies ratings of mortgage backed securities a wrong expectation? (please do answer)

This issue is explored a bit here:
http://rogerpielkejr.blogspot.com/2011/08/fun-with-epistemic-and-aleatory.html

2. "Well why not ask/challenge them to make a projection that is *designed* to facilitate falsification?"

You are new around here, I take it. I've given this a go for a while, e.g.,

Maybe you can have better luck ...

3. "The IPCC don't expect them all to come true"

Indeed. That is why I wrote about this: "What does it mean? Nothing too interesting, really -- science evolves and any assessment is a snapshot of knowledge in time. However, I suspect that some people will get excited or defensive to learn that by the IPCC's own logic, the report's future-looking findings could include 28% or more that will not stand the test of time"

You and I can agree to disagree that the IPCC understood and even had a consistent approach to uncertainty, despite having uncertainty guidance, as our paper shows.

Thanks again!

Paul S said...

bernie - 'Given the dice example, the only reason that it can be 80% and not 100% is that the dice may not be fair.'

I think this could be stated better because it would be hypothetically possible to have 100% confidence in a probabilistic forecast of an unfair die. In the climate system we don't have any a priori information about whether the 'dice' should be fair or not.

'Confidence' is a quantification of the extent to which we have well-constrained knowledge of the intrinsic probabilities of possible outcomes. That is, how well we understand the unique 'fairness' properties of the dice.

In answer to the questions in the post:

1) An asymmetric gaussian pdf with 80% of the integral covering <0.05 and a long tail up to 1.

2) A gaussian distribution pdf with 80% of the integral occurring between 0.33-0.66 and tails either side.

I think there is some suggestion upthread that a uniform distribution is preferred above gaussian in the absence of further information. This may be generally right though I would think in many cases, when e.g. 80% confidence in p<0.05 is asserted, it would be correct for p=0.06 to be significantly more probable than p=1.

Dikran Marsupial said...

Roger@60 "Question: Were the ratings agencies ratings of mortgage backed securities a wrong expectation? (please do answer)"

It depends what you mean by "expectation". However I know very little about finance, so I probably couldn't give a useful answer to your question without you first providing considerable background so that I understood the question.

I note however that I gave a specific example that demonstrates that the observations of a stochastic approximation should not be expected (normal English usage) to lie within the error bars of the expectation (statistical usage) and that you did not respond to it.

As to getting the modellers to give you a falsifiable projection, how about being constructive by first proposing such a projection and asking if the modellers agree that it would falsify the theory? For example, "if CO2 rose over the next century according to A1FI, but global temperatures fell by 1 degree over the next fifty years, in the absence of large scale changes in other forcings, would that falsify the theory?". I suspect most climate modellers would say yes.

Now I don't need to ask the climatologists as I can see the models are clearly falsifiable, and looking at the spread of the model runs it is immediately apparent to me what would constitute a falsifying event. If you are unhappy with that, then I am not in a position to articulate your objections.

Of course we can agree to disagree that the IPCC understood and had a consistent approach to uncertainty. Personally I think it is a rather minor issue, as the model runs are archived, so if you want to perform a proper statistical treatment of the uncertainty of the projections, rather than rely on brief summary statements, there is nothing to stop you from doing so.

Now if you want to assess the reliability of the IPCC reports, then you have made a start by estimating the expected number of projections that turn out to have been incorrect. Next compute the variation around the expected value we should expect to observe assuming the IPCC science is correct. If the number we actually observe falls within the error bars, then we can conclude that the IPCC's statements of uncertainty were probably well calibrated. If it is significantly higher, then they probably under-estimated the uncertainty; if significantly below then they have probably overstated the uncertainty. That would seem a reasonable approach to me.
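A minimal sketch of that calibration check, under loudly stated assumptions: the number of statements (100) is invented, every statement is treated as independent with the same 28% chance of being wrong, the error bars are two binomial standard deviations either side of the expected count, and `assess` is a hypothetical helper.

```python
import math

# HYPOTHETICAL inputs, for illustration only.
n_statements = 100   # invented number of forward-looking statements
p_wrong = 0.28       # fraction expected to be wrong if the stated
                     # uncertainties are correct (figure from the thread)

# Binomial mean and standard deviation, ASSUMING independent statements.
expected_wrong = n_statements * p_wrong
sd_wrong = math.sqrt(n_statements * p_wrong * (1.0 - p_wrong))

# Roughly 95% error bars on the number of wrong statements.
lower = expected_wrong - 2 * sd_wrong
upper = expected_wrong + 2 * sd_wrong

def assess(observed_wrong):
    """Compare an observed count of wrong statements with the error bars."""
    if observed_wrong > upper:
        return "uncertainty probably under-estimated"
    if observed_wrong < lower:
        return "uncertainty probably overstated"
    return "probably well calibrated"

print(f"expected wrong: {expected_wrong:.1f} +/- {2 * sd_wrong:.1f}")
for count in (10, 28, 45):
    print(count, "->", assess(count))
```

Real statements would carry different stated probabilities and would not be independent, so the error bars here are only indicative.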

Roger Pielke, Jr. said...

-62-DM

Thanks, my last comment for a while, a busy day ahead ..

1. "It depends what you mean by "expectation"."

Indeed

2. "... you did not respond to it ..."

We have had a fine exchange, please don't play blog games by asking questions such as "do you believe 2+2=4?" and then making an issue out of a non-answer.

3. "I suspect most climate modellers would say yes. "

The issue is not what is falsifiable over 100 years, but over more human time scales -- 5, 10, 20, 30 years.

4. "Now if you ... approach to me"

Thanks.

Thanks for the exchange, please do drop by from time to time.

Dikran Marsupial said...

PaulS@61 Indeed: For example if you inspected the die and saw it had no sixes and two ones, you could say with 100% confidence that the probability of two rolls of the dice summing to 12 would be zero.

Your interpretation of the IPCC statements looks perfectly reasonable to me. While I assumed a piecewise uniform p.d.f. for MAXENT reasons, you have justified why you would choose a different p.d.f. The important thing though is that we both agree that the statements are descriptions of p.d.f.s, although it is not possible to specify the exact p.d.f. without making additional assumptions.

Dikran Marsupial said...

O.K. Roger, message received and understood.

bernie said...

PaulS:
Obviously any dice may be fair or unfair. The issue is that if on the roll of a dice you are 80% confident that the odds of the dice coming up 2 sixes are the same as if they are fair dice, why are you not 100% confident, and what implications does that have for the chances of rolling 2 sixes?
The dice example allows a clear separation between the probabilities of an event if certain assumptions are met, while the confidence presumably gives the probability that those assumptions will in fact be met. With the dice example, the critical assumption is that the dice are fair. With other IPCC predictions/findings it is difficult if not impossible to list the assumptions that are being made.