25 October 2011

The Games Climate Scientists Play

[UPDATE #3 11/2: A follow-up post is here.]


[UPDATE #2: I will be moving on to more (less?) fruitful topics. But let me wrap up this interesting episode by restating that I stand by everything in this post and the discussion in the comments here and elsewhere. The RC11 methodology does not make any use of data prior to 1910 insofar as the results are concerned (despite suggestions to the contrary in the paper). If there is a criticism of this post to be leveled it is, as several professional colleagues have observed in emails to me, that 1911 is not the right cutoff for the cherrypick, but it is more like 1980 (i.e., they argue that no data before 1980 actually matters in the methodology). This is a fair criticism. I'll be using the RC11 paper in my graduate seminar next term as an example of cherry picking in science -- a clearer, more easily understandable case you will not find.]

[UPDATE: At Real Climate Stefan Rahmstorf has a long and laborious post trying to explain not only the 1911 cherry pick, but several others that defy convention in attribution studies. In the comments below I publish Stefan's response to my query -- They used "trends" (using a new definition of that term in climate science) such that the "trend" from 1911 is the same as that from 1880.  Look at the graph below and ask yourself how that can be -- Climate science as ink blot.]

Here is another good example why I have come to view parts of the climate science research enterprise with a considerable degree of distrust.

A paper was released yesterday by PNAS, by Stefan Rahmstorf and Dim Coumou, (also freely available here in PDF) which asserts that the 2010 Russian summer heat wave was, with 80% probability, the result of a background warming trend. But if you take a look at the actual paper you see that they made some arbitrary choices (which are at least unexplained from a scientific standpoint) that bias the results in a particular direction.
Look at the annotated figure above, which originally comes from an EGU poster by Dole et al. (programme here in PDF). It shows surface temperature anomalies in Russia dating back to 1880. I added the green line, which shows the date from which Rahmstorf and Coumou decided to begin their analysis -- 1911, immediately after an extended warm period and at the start of an extended cool period.

Obviously, any examination of statistics will depend upon the data that is included and not included. Why did Rahmstorf and Coumou start with 1911? A century, 100 years, is a nice round number, but it does not have any privileged scientific meaning. Why did they not report the sensitivity of their results to choice of start date? There may indeed be very good scientific reasons why starting the analysis in 1911 makes the most sense and for the paper to not report the sensitivity of results to the start date. But the authors did not share that information with their readers. Hence, the decision looks arbitrary and appears to have influenced the results.
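
A start-date sensitivity check of the kind asked for here can be sketched in a few lines. The example below refits an ordinary least-squares slope from several start years. It is only an illustration: RC11 uses a nonlinear trend, not this linear fit, and the anomaly series is invented (warm early, cool from 1911, rising after 1940), not the Moscow record. The function names `ols_slope` and `slope_by_start_year` are hypothetical.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def slope_by_start_year(years, values, start_years):
    """Refit the linear trend from each candidate start year to the end."""
    result = {}
    for start in start_years:
        pairs = [(y, v) for y, v in zip(years, values) if y >= start]
        xs = [p[0] for p in pairs]
        ys = [p[1] for p in pairs]
        result[start] = ols_slope(xs, ys) * 10.0  # units per decade
    return result


# Invented anomalies (NOT real data): warm until 1910, cool 1911-1940,
# then a steady rise of 0.02 per year afterwards.
years = list(range(1880, 2010))
values = []
for y in years:
    if y <= 1910:
        values.append(0.5)
    elif y <= 1940:
        values.append(-0.5)
    else:
        values.append(-0.5 + 0.02 * (y - 1940))

trends = slope_by_start_year(years, values, [1880, 1911, 1941])
for start, slope in trends.items():
    print(f"start {start}: {slope:+.3f} per decade")
```

In this made-up series the fitted slope is smallest when the fit starts in 1880 and largest when it starts in 1941, which is the kind of table a reader would need to judge how much the start date matters.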

Climate science -- or at least some parts of it -- seems to have devolved into an effort to generate media coverage and talking points for blogs, at the expense of actually adding to our scientific knowledge of the climate system. The new PNAS paper sure looks like a cherry pick to me. For a scientific exploration of the Russian heat wave that seems far more trustworthy to me, take a look at this paper.

113 comments:

  1. Looks like your link points to a paper that's behind a paywall. Is this the one? http://wattsupwiththat.com/2011/07/13/peer-reviewed-paper-2010-russina-heat-wave-mostly-natural/

  2. Why not 150 years, which seems to be the gold standard in Climate Scientology?

    Lies, Damned Lies and Statistics...

  3. Dole et al. explain:

    "The July surface temperatures for the region impacted by the 2010 Russian heat wave shows no significant warming trend over the prior 130-year period from 1880 to 2009."

  4. This might be a bit off topic. I noticed that the 1930's were the hottest. I recall that this was also true for North America. Can't help but think that this would affect arctic sea ice. Any record of arctic sea ice for the 1930's?

  5. Just sent this to Stefan:

    "Hi Stefan-

    A quick question -- in your paper on the Russian heat wave you have this passage:

    "With a thus revised nonlinear trend, the expected number of heat records in the last decade reduces to 0.47, which implies a 78% probability [(0.47 - 0.105)/0.47] that a new Moscow record is due to the warming trend. This number increases to over 80% if we repeat the analysis for the full data period in the GISS database (i.e., 1880-2009), rather than just the last 100 y, because the expected number for stationary climate then reduces from 0.105 to 0.079 according to the 1/n law."

    My question is: What trend value did you use for 1880-2009? And based on this trend, what is the expected number of heat records in the past decade?

    Many thanks,

    Roger"

    Let's see what he says.

  6. Roger -

    Why would 130 years be less arbitrary, scientifically, than 100? Is there some aspect of data gathering that changed 130 years ago?

  7. -6-Joshua

    It may not be less arbitrary, but given that the data exists further back in time, why not use it?

    As Dole et al. tell us, there are no upwards trends in temperature since 1880. Thus, there should be a reason to select a subset of the data to present trends. RC11 don't explain that choice and thus leave themselves open to critique.

  8. Sure, the critique seems reasonable, but are there data available that extend back prior to 1880?

    You're using the data period that Dole et al. used as a comparison - but what lies behind their methodology of data gathering? Without further information, I don't understand why you wouldn't apply the same reasoning to Dole et al. If the data simply don't exist prior to 1880, then it's a different matter.

  9. Sorry, Roger -

    I didn't see your comment in the thread. So the GISS database that they used in their own paper extended back 130 years but they only determined the trend for the last 100 years - and they didn't offer an explanation?

  10. -9-Joshua

    Correct, though they did offer some mystical words about "non-linear" trends. Let's see what Rahmstorf says in response to my query (or not) ...

  11. Because July 2010 is by far the hottest on record, including it in the trend and variance calculation could arguably introduce an element of confirmation bias. We therefore repeated the calculation excluding this data point, using the 1910–2009 data instead, to see whether the temperature data prior to 2010 provide a reason to anticipate a new heat record. With a thus revised nonlinear trend, the expected number of heat records in the last decade reduces to 0.47, which implies a 78% probability [(0.47 − 0.105)/0.47] that a new Moscow record is due to the warming trend. This number increases to over 80% if we repeat the analysis for the full data period in the GISS database (i.e., 1880–2009), rather than just the last 100 y, because the expected number for stationary climate then reduces from 0.105 to 0.079 according to the 1/n law.


    From the Rahmstorf paper
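
The arithmetic in the quoted passage can be checked directly. The sketch below computes the "1/n law" baseline (the expected number of records in the last ten years of a trend-free series of length n is the sum of 1/k over those years) and the bracketed fraction. The values 0.47, 0.105 and 0.079 are taken from the quote; the function names are hypothetical, and holding 0.47 fixed for the longer period is an assumption made here, since the passage gives no separate figure for 1880 onward.

```python
def expected_records_stationary(n_years, last=10):
    """Expected record highs in the final `last` years of an n_years-long
    series with no trend (i.i.d. values): the sum of 1/k over those years."""
    return sum(1.0 / k for k in range(n_years - last + 1, n_years + 1))


def fraction_due_to_trend(expected_with_trend, expected_stationary):
    """Share of expected records not explained by a stationary climate."""
    return (expected_with_trend - expected_stationary) / expected_with_trend


base_100 = expected_records_stationary(100)  # rounds to the quoted 0.105
base_130 = expected_records_stationary(130)  # close to the quoted 0.079

# 0.47, 0.105 and 0.079 are the figures quoted from the paper.
share_100 = fraction_due_to_trend(0.47, 0.105)
# ASSUMPTION: 0.47 is held fixed; only the stationary baseline changes.
share_full = fraction_due_to_trend(0.47, 0.079)

print(f"baselines: {base_100:.3f} {base_130:.3f}")
print(f"shares: {share_100:.0%} {share_full:.0%}")
```

A series length of 130 gives a baseline of about 0.080 and a length of 131 about 0.079, so the exact year count behind the quoted 0.079 is a guess here. Under the stated assumption, the move from 78% to over 80% comes entirely from the smaller baseline.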

  12. Analyse using standard deviations assuming no trend, then.
    You will arrive at 100% chance instead of 80%.

  13. 80% is not a significant correlation in the first place. Add to that the cherry picking, and this is yet another in a long list of examples of the AGW con:
    Big claims made in a high profile way that turn out to be junk.
    Yet the AGW community takes to this sort of faux science like ducks to water.
    It is not so much that faux science gets promoted. That has become a boring disappointment. The interesting thing is that the AGW community accepts it time after time, no matter how silly and infantile it gets. I think of the great paper on how CO2 would cause more kidney stones in the US. Or how tropical constrictors would move their range to Kansas. Or how Tibetan glaciers would be gone and the rivers they feed run dry (which is still getting tossed around).
    That the AGW community has so much social power and accepts this level of bogosity is like a child playing with daddy's 9mm.

  14. -11-ob

    That statement is not correct, they do not repeat the analysis for 1880-2009. I will post up Rahmstorf's response to my query on this point in the next comment.

  15. -5-Me

    Here is Stefan's reply:

    "Roger, we did not try this for a linear trend 1880-2009. The data are not well described by a linear trend over this period.

    Stefan"

    The reason why they ignored trends 1880-2009 is that there is no trend!!

  16. The reason Stefan uses that time period is because it is that time period that is *described by* a linear trend, and therefore can be used for an analysis of the statistics around a linear trend. There is no claim that the 1880s were not warm (which is well known). They were, however, not as warm as 2010. Nor does that impact the point he is making, which is that the probability of extremes has increased. Furthermore, it turns out that the paper you cite -- Dole et al. -- made an (evidently) incorrect adjustment for the urban heat island effect, turning summer warming into summer cooling. Look at Stefan's post on RealClimate.

  17. From Stefan's post:

    [The Dole et al. adjustment] leads to a massive over-adjustment for urban heat island in summer, because the urban heat island in Moscow is mostly a winter phenomenon (see e.g. Lokoshchenko and Isaev). This unrealistic adjustment turns a strong July warming into a slight cooling. The automatic adjustments [as used by Dole et al.] used in global gridded data probably do a good job for what they were designed to do (remove spurious trends from global or hemispheric temperature series), but they should not be relied upon for more detailed local analysis.

  18. -16-Eric

    Can you point me to any other attribution study ever published that uses the definition of "trend" applied by Rahmstorf?

    You cannot. It is a method devised especially for this paper and defies convention in this field.

    They say that "trends" from 1880 are identical to those from 1911 -- yeah, right.

    I am sure that Anthony Watts is having a good laugh at Rahmstorf's new-found problems with UHI adjustments ;-)

    Funny!

  19. Why the surprised reaction? The cherrypicking was evident already in AR4 WG2. Ends justify the means.

  20. Roger: don't change the subject. Stefan's paper is simple and clear, and anyone reading it will see that you are simply making up your own facts.

  21. -19-Eric

    So I take it that you cannot in fact point to any other study ever published that uses this definition of "trend" in an attribution study?

    Here is another question for you:

    The addition of data from 1880 to 1910 to the analysis changed the trend (and expected probability of extremes in the current decade, using Rahmstorf's unique definition of "trend") as compared to the analysis starting in 1911 by:

    a) zero

    b) a non-zero amount

    The answer is a) of course, surely you agree. Their methodology is a cherrypick because the 1880-1910 data has absolutely no impact on a trend calculation. Convenient of course for their argument, and completely contrary to convention in attribution studies.

    Replies to these questions welcomed, Thanks!

  22. I just left these questions at Real Climate:

    "Stefan-

    Could you answer the following two questions?

    The addition of data from 1880 to 1910 to the analysis changed the trend (and expected probability of extremes in the current decade, using your unique definition of “trend”) as compared to the analysis starting in 1911 by:

    a) zero
    b) a non-zero amount


    A second question for you:

    Please point to another attribution study -- any ever published -- that uses the same definition and operationalization of “trend” that is introduced in this study.

    Thanks!"

  23. -10- Roger

    So they did offer an explanation? That seems to be in contrast to what you said up-thread.

  24. Now this is getting interesting. Eric Steig replies over at RC that the addition of data from 1880 to 1910 does in fact change the trend and expected probability of extremes in the current decade. As this data was not reported in the paper, it will be interesting to see what numbers they provide.

  25. It isn't interesting, there's nothing to see here. You just did not read the paper.

  26. -25-chriscolose

    Surely you can do better than this, maybe, e.g., say why it is not interesting?

    If you want to debate whether or not I read the paper, that is a debate I will win ;-)

  27. -23-Joshua

    Indeed it is different than what I said ;-)

    Let us see what numbers they provide ...

  28. This is the follow up that I left at RC:

    "Roger Pielke, Jr. says:
    Your comment is awaiting moderation.
    26 Oct 2011 at 11:46 AM

    Thanks Eric,

    What then is (a) the value for the trend starting from 1880 vs. 1910 and (b) the expected probability of extremes in the current decade started from 1880 (it was 0.47 starting from 1911)?

    Thanks!"

  29. - 27 - Roger,

    I find a lot of what you write to be interesting. I have recommended your examination of the economic costs/benefits of CO2 restrictions to friends.

    But you have to understand that when you make a statement like you did above, and further generalize based on that mistake to make a widespread characterization of a large number of scientists:

    "Here is another good example why I have come to view parts of the climate science research enterprise with a considerable degree of distrust."

    ...it doesn't exactly inspire confidence.

    We all make mistakes. But if you make a generalization about others based on your own error, then I think that you need to do more than simply acknowledge that mistake in isolation (as the mistake impinged on a non-discrete conclusion).

    I would suggest that you need to look at what would cause you to attribute your own mistake to a general trend that lies beyond your own actions. My suggestion is that it would, logically, reflect a confirmation bias on your part.

    It doesn't have to - but certainly suggests that could be the case.

  30. Real Climate has approved my comment in -28- without a reply.

  31. Roger, it is you playing games...posting hostile blog posts and ranting about your huge distrust in climate science, and then acting all nice and curious when you go over to RC. This is all supplemented by accusations that are simply false. They talk about the difference between 100 yr vs. the whole time-series right in the paper (the probability of a new extreme due to the warming trend changes relatively little).

    I meant to imply that what Eric said wasn't interesting (since he was just responding to claims based on not understanding the paper), and the childish "I'm going to come back to my blog and post what he said, then I'm going to go there and say something else" is less interesting still. But whatever. It's the blogs and anything goes. I just think an apology is in order to those people actually publishing on this.

  32. -29-Joshua

    Thanks ... I can assure you that my generalization is not based on this one paper or this one interaction.

    Have a look at Chapters 7 and 8 of The Climate Fix for a broader examination of the issues.

    And no, I don't trust some of these folks. Sorry. It is a quantity easily lost and hard to regain. Does it color how I view some of their work? Absolutely, how could it otherwise?

    What mistake is that you refer to? I stand by my critique ...

    Thanks!

  33. -31-chriscolose

    I post my responses to Real Climate here because in the past they have been disappeared. Maybe they don't do that anymore, but based on my experience, I am playing it safe.

    The paper does not present the numbers on what the expected number of heat records is in the past decade based on starting the trend in 1880 rather than 1911. I have asked Eric for this information. If you know what it is please say so, but I guess you don't.

    I have asserted that this number does not change, because, as I read their paper, their methods make the period before 1911 irrelevant. Could I be wrong? Sure. Let's see the numbers that I have asked for at RC.

  34. -32 Roger,

    I didn't suggest that your generalization was based on this one dispute alone.

    What I was talking about was that you discussed that this dispute was an example of a larger phenomenon.

    That's interesting.

    Because what you referred to was that a cherry-pick from a larger dataset without explanation was a characteristic attribute of the work of a large subset of science and scientists.

    As it turned out, an explanation for the selection of a subset of the data was provided. Whether you accept the explanation as valid or not, it doesn't change your mistaken assertion that no explanation was offered. Selecting a subset from a larger database without explanation is a substantially more serious breach of scientific ethics than selecting a subset from a larger database and providing an explanation that some might not think is valid.

    I think that you would agree?

    So - what we have is that you made a mistaken assertion, and used that mistaken assertion as a validation for a larger generalization.

    In other words, what I think is interesting is that in a sense you were right - this example could be representative of a larger phenomenon in the sense that you made an assertion that turned out not to be true.

    Again - that suggests to me confirmation bias. You already felt that the larger generalization was valid, and thus you found a breach of scientific evidence which, in fact, didn't exist.

    It shows to me a rather stark potential for confirmation bias on your part. That would suggest that you should examine other situations where you drew conclusions about the relationship between discrete phenomena and a larger generalization for potentially similar confirmation biases.

    Be that all as it may - here's the bottom line for me. I find some of your work quite interesting. At some level, you evaluate aspects of the climate debate that are above my level of technical understanding. As such, the integrity of your approach to the science is key for me. This situation, as I said, does not enhance my confidence in your analyses.

    Saying something on the order of "Yes, I made an accusation that was untrue, and that is at least potentially true because of a possible confirmation bias. I need to be extra careful about the possibility of confirmation bias on my part in the future." would help to increase my confidence in your work.

    Of course, I'm only one reader, of little importance....

  35. -34-Joshua

    Thanks but you lost me -- you assert "As it turned out, an explanation for the selection of a subset of the data was provided."

    I missed that. Please point me to it. Why 1911?

    Where I see that I am being told that I am wrong in my critique is that some assert that they did not use a subset of the data, but all of it. And I think that claim is wrong.

    My simple question that could resolve this sits unanswered at RC.

  36. - 35 - Roger


    "I missed that. Please point me to it. Why 1911?"


    This is probably getting more complicated than it's worth. Perhaps I am just confused.


    I'll try one more time to explain, briefly. I'll look for your response, but then I'll let it drop.


    Here is my confusion...


    I said the following:


    "I didn't see your comment in the thread. So the GISS database that they used in their own paper extended back 130 years but they only determined the trend for the last 100 years - and they didn't offer an explanation?"


    To which you replied:


    "Correct, though they did offer some mystical words about "non-linear" trends. Let's see what Rahmstorf says in response to my query (or not) ..."


    In response, I later said:


    "So they did offer an explanation? That seems to be in contrast to what you said up-thread."


    To which you then replied:

    "-23-Joshua

    Indeed it is different than what I said ;-)"



    I took that to mean that you acknowledged a mistaken assertion.

  37. -36-Joshua

    Looks like we both got lost there ... I was referring to Eric Steig's claim that the trends and the expected number of heat records actually change based on using the full dataset.

    I do not think this to be the case based on my reading of the paper, so that is what I was referring to ... I left a comment at RC asking for the specific numbers and that comment has gone unanswered.

    Sorry for any confusion!

  38. Dr Pielke
    I fear that you may not realise that 'Joshua' and 'chriscolose' are well-known trolls (noted at other blogs - possibly even members of the [Warmista] Rapid Reaction Team). I note that the last 7 comments are either from them, or your replies to them. They have successfully, as intended, hijacked this thread.
    If you realise this, and nonetheless believe in engaging them, then my apologies.

  39. Another RC blogger steps in and replies but does not answer the question that I posed in #28 above:

    "[Response: You're missing the argument about the importance of nonlinear trends, particular wrt the large increase in warming since 1980, in driving the pattern of expected and observed extremes, as discussed on pages 3-4 of the article. You seem to be thinking only in terms of a linear trend. With a nonlinear trend, what happened from 1880 to 1910 is relatively less important than it would be in an analysis based on a linear trend --Jim]"

    In response I just submitted this, repeating my request:

    -----------------------
    Thanks Jim,

    I understand what you are saying, but I am asking for some numbers to back up Eric's claim in #9 above.

    Can you provide the numbers that show how (a) the addition of 1880-1909 alters the trend calculation (from that starting in 1911) and (b) based on adding 1880-1909, how that changes the expected number of heat records (from the 0.47 based on 1911).

    My assertion is that the addition of 1880-1909 changes neither of these values, and Eric said otherwise. So I'd like to see the numbers.

    I agree with you 100% that "With a nonlinear trend, what happened from 1880 to 1910 is relatively less important than it would be in an analysis based on a linear trend".

    What I would like to see is the quantification of "relatively less important" in this case. Can you provide the numbers behind (a) and (b)?

    Thanks!
    -------------------------

  40. Gosh the way these scientists are reacting to Roger's simple question in methodology, one gets the impression they believe they reside on Mount Olympus.

  41. -38-Ir'Rational

    Thanks, all are welcome here -- if they are trying to hijack the thread (not a usual occurrence here;-) then they have some work cut out for them, I'm focused on getting the numbers from RC;-)

  42. Looking for "non-linear trend" on Google and this is currently the top hit

    http://climateaudit.org/2009/07/03/the-secret-of-the-rahmstorf-non-linear-trend/

  43. At RC, Martin Vermeer shares my view on the impact of 1880-1909:

    http://www.realclimate.org/index.php/archives/2011/10/the-moscow-warming-hole/comment-page-1/#comment-217625

  44. Is it possible they are expecting you to do these calculations (or perhaps it's not up to them to provide calculations)? Is the data all available..? Or, have you already done and know the calculations, but just want them to come along (like a lawyer cross-examination?) :)

  45. -44-Salamano

    The answer to the questions is easily found in the paper. So yes, I know the answer to both questions, and so does Martin Vermeer as linked above. It is not rocket science. It is not even climate science. Even though I know the answer, I would sure like to hear RC state the obvious nonetheless. That they won't directly answer a simple question with an obvious answer is interesting I guess.

    Bottom line -- The data prior to 1911 (or 1910) has absolutely no bearing on the analysis. It is ignored by the methodology. For the paper to suggest that they repeated the analysis using data from 1880-1909 is, to be exceedingly generous, a claim that deserves critique.

  46. Stefan Rahmstorf flips out:

    "Faced with this kind of libelous distortion I will not answer any further questions from Pielke now or in future."

    He also tries the appeal to authority:

    "our paper was reviewed not only by two climate experts but in addition by two statistics experts coming from other fields."

    And finally:

    "If someone thinks that using a linear trend would have been preferable, that is fine with me - they should do it and publish the result in a journal."

    Well guess what? That has already been done:
    http://www.agu.org/journals/gl/gl1106/2010GL046582/

    And they found no trend from 1880. Makes one wonder why RC11 decided to adopt a methodology that makes that earlier period disappear and then flip out when pressed on that methodological choice.

    http://www.realclimate.org/index.php/archives/2011/10/the-moscow-warming-hole/comment-page-1/#comment-217595

  47. So with this semantic war, you've said that the data before 1910 is in effect ignorable given the methodology choice, and then they retort that you've accused them of 'ignoring' the data...which, it seems you BOTH assert, actually happens anyway when applying the methodology, whether through intense scrutiny of pre-1910 data or just by tossing it in the wastebin before even looking at it.

    Sounds like tensions and ego-involvement are running high, despite both of you saying the same thing. I don't think I'd take your charges as libelous (but it's not my paper)...You ARE saying that their methodology in effect relegates the data into deep insignificance, such that it doesn't matter whether they looked at it or not, but I don't think you actually accused them of not looking at it, did you?

  48. -47-Salamano

    "You ARE saying that their methodology in effect relegates the data into deep insignificance, such that it doesn't matter whether they looked at it or not, but I don't think you actually accused them of not looking at it, did you?"

    No, of course I did not -- and yes, "relegates the data into deep insignificance" is perfectly consistent with this post and discussion in the comments.

    It seems so obvious a point that I am surprised that it is being defended -- on second thought, no I am not.

    I will note that the RC guys are mostly letting their proxies do the debating for them ;-)

  49. Stefan gets caught out doing a classic contrived paper to wring a desperately needed result and balks and leaves the field.
    That seems to be a very common pattern in the AGW promotion game.

  50. Stefan gets caught out cherry picking, calls names and leaves.
    This would seem to be a very common tactic in the AGW promotion industry.

  51. I understand your main original message to mean "be careful when interpreting this study". Well, that's certainly true.

    However, you should take a step back and consider whether the way you do it is commensurable with an "honest broker" approach.

  52. Stefan has imposed an ex post facto hockeystick (i.e., non-linear trend) by picking his starting point. This manipulation cannot even be disguised by BCPs, ignored PCs and upside-down Finnish varves. I suspect that any 100 year climate record that has just been broken is very likely to produce something that looks like a hockeystick.
    Diagnosis: Extreme confirmation bias
    Prescription: When you find yourself in a hole, stop digging

  53. Salamano et al. (there seems to be a cornucopia of pseudonyms around here): as I understand it, RC11, by picking 1910-11, gains a trend, whereas by picking the full database back to 1880, no trend is observable (hence, by adding the latter analyses, if it were done, to the former, since the latter has 0 trend, no difference would be made - 1+0=1!) and they do this without any proper explanation. For Eric Steig to assert that there is a difference and yet not provide the numbers just adds to the nonsense and is par for the course with RC. And then Rahmstorf stamps off in a huff.
    Now, Salamano, what is difficult to understand about this? It seems pretty plain and simple. And yet you manage to get it completely back to front - it isn't that because 1880 'adds nothing' because it shows no trend it's 'ignorable', it is that 1911 is cherry picked, and shows a trend (or appears to, until an explanation is forthcoming), that it is 'ignorable'. Simples!

  54. -53-Lewis

    Thanks for your comment, let me offer a bit of a clarification of your point, which is correct in its conclusion, but misses a few details.

    1. A linear trend from 1880 shows no increase.
    2. In this context, RC11 adopts an unconventional "non-linear trend" that ignores (for the most part, and increasingly back into time) what happened before 1980 (see update above) -- Note it is unconventional in the attribution literature, in fact I think it is sui generis.
    3. The paper draws our attention to 1911 because, as SR explains, their hypothetical Monte Carlo simulation used 100 years -- an arbitrary choice if ever there was one.
    4. Since the hypothetical used 100 years, they apply that time period to the Russian data -- conveniently
    5. The paper asserts that adding in the 1880-1909 data strengthens the top line conclusion -- this is just wrong, the data itself has no effect (see Martin Vermeer's comments reproduced above -- Steig was in fact wrong in his comment, which explains why he has not reappeared with those numbers). Adding more years does change the abstract math, thus giving an impression that is simply not true. (Think about it, would adding 3 decades of warmer temperatures to the analysis really make recent extremes statistically _more_ likely? This doesn't even pass the laugh test.)
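
The "abstract math" in point 5 can be illustrated with a small Monte Carlo sketch. For a trend-free series, the expected number of records in the last decade depends only on the length of the series, not on what the values were, so lengthening the series from 100 to 130 years lowers the stationary baseline whatever the early data look like. This is not the RC11 simulation; the function names, trial count and seed are assumptions made for illustration.

```python
import random


def records_in_last_decade(series, last=10):
    """Count values in the final `last` positions that exceed all earlier ones."""
    count = 0
    running_max = float("-inf")
    for i, value in enumerate(series):
        if value > running_max:
            running_max = value
            if i >= len(series) - last:
                count += 1
    return count


def simulate_stationary(n_years, trials=10000, seed=0):
    """Average record count in the last decade of trend-free random series."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Uniform noise; any continuous i.i.d. distribution gives the same
        # expected record count.
        series = [rng.random() for _ in range(n_years)]
        total += records_in_last_decade(series)
    return total / trials


mc_100 = simulate_stationary(100)
mc_130 = simulate_stationary(130)
print(f"n=100: {mc_100:.3f}   n=130: {mc_130:.3f}")
```

The simulated averages should land near the 0.105 and 0.079 figures quoted from the paper, up to sampling noise, without any temperature data entering the calculation.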

    You are correct that when asked to provide very simple and straightforward numbers RC has refused.

    Bernie in -52- is onto something -- the cherrypick in this case is so much in-your-face that it cannot be obscured by appeals to methodological complexity.

    Advice for RC next time -- further complexify your cherry picks to ensure maximum plausible deniability ;-)

    SR has accused me of libel for pointing all of this out, a very strong claim, and used that as a basis for refusing to comment on his paper in the face of my very simple questions -- I await being sued!

  55. So, not just a cherry pick but a re-definition of the idea of a 'trend'. Climate scientists draw inappropriate linear trends through data all the time - except when it doesn't give the answer they want. The 'nonlinear trend' used here seems to be just a poorly explained smoothing of the data, from a 2-page EOS article by Moore and Grinsted, ref 22.

    It was Rahmstorf's refusal to answer a straightforward question about smoothings near endpoints that was one of the key steps in my conversion from AGW believer to skeptic.

    My favorite example is when he extended the smoothing period on a graph for the Copenhagen report to hide the recent decline in warming, but forgot to mention that he had done so. (Google "source of fishy odor confirmed").

    In case Stefan wants to sue me as well as Roger:
    Paul Matthews
    School of Mathematical Sciences
    University of Nottingham, UK

  56. Roger I believe you are incorrect in your assessment of the statistics. I don't claim to be 100% sure on it, but we can discuss it here if you want, leaving accusations of malfeasance aside.

  57. -56-Jim Bouldin

    As my views are just my views, I could very well be incorrect, so let's chat about what you are thinking -- that is probably the best way to sort it out.

  58. You've said numerous things but I understand your principal position to be that the temperature record for the years prior to 1911 should make no difference in the calculated frequency of expected record high temps in the most recent decade, and that that expectation is, rather, a function simply of the trend over whatever time period is analyzed. Am I correct in this interpretation?

  59. -59-Jim Bouldin

    Yes, this is essentially it, but let me be exceedingly precise.

    The paper states:

    "Because July 2010 is by far the hottest on record, including it in the trend and variance calculation could arguably introduce an element of confirmation bias. We therefore repeated the calculation excluding this data point, using the 1910–2009 data instead, to see whether the temperature data prior to 2010 provide a reason to anticipate a new heat record. With a thus revised nonlinear trend, the expected number of heat records in the last decade reduces to 0.47, which implies a 78% probability [(0.47 − 0.105)∕0.47] that a new Moscow record is due to the warming trend. This number increases to over 80% if we repeat the analysis for the full data period in the GISS database (i.e., 1880–2009), rather than just the last 100 y, because the expected number for stationary climate then reduces from 0.105 to 0.079 according to the 1∕n law."

    As written the paper suggests that the >80% number is the result of this calculation:

    =(0.47-0.079)/0.47 = 83.2% which is >80%

    As you have noted elsewhere, the 0.47 value for the expected number of heat records in the past decade, could be as low as 0.40 and the probability would still be >80%.
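
    The arithmetic quoted in this comment can be checked with a minimal sketch. The function names below are hypothetical, chosen only for illustration; the inputs 0.47, 0.105, 0.079 and 0.40 are the figures quoted above, and nothing here reproduces the paper's own trend calculation.

```python
def attributable_fraction(expected_with_trend, expected_stationary):
    """Share of the expected record count in excess of the stationary expectation."""
    return (expected_with_trend - expected_stationary) / expected_with_trend


def minimum_expected_count(target_fraction, expected_stationary):
    """Smallest trend-case expectation that still reaches the target fraction."""
    return expected_stationary / (1.0 - target_fraction)


# Figures quoted from the paper and from this thread.
p_100 = attributable_fraction(0.47, 0.105)   # 100-year stationary value
p_130 = attributable_fraction(0.47, 0.079)   # 130-year stationary value
p_low = attributable_fraction(0.40, 0.079)   # the "as low as 0.40" case
floor = minimum_expected_count(0.80, 0.079)  # break-even point for 80%

print(f"100 y: {p_100:.3f}  130 y: {p_130:.3f}  low case: {p_low:.3f}")
print(f"expected count needed for 80%: {floor:.3f}")
```

    Holding 0.47 fixed and changing only the stationary value moves the result from roughly 78% to roughly 83%, and any expected count above about 0.395 still clears 80%.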

    My assertion is that the non-linear trend calculated over 1910-2009 used to calculate the 0.47 and as referred to in the following passage -- "... using the 1910–2009 data instead, to see whether the temperature data prior to 2010 provide a reason to anticipate a new heat record. With a thus revised nonlinear trend ..." -- is identical to the value calculated using the 1880-2009 data.

    In other words, the addition of the data prior to 1910 make no difference in the calculation of expected heat records in the current decade -- that value will remain at 0.47.

    I have asked you guys at RC to provide any data to suggest that this is incorrect, by asking for both (a) the revised trend values when 1880-1909 is included versus when it is not, and (b) the effect on the 0.47 value of including the earlier data in the calculation. Eric suggested that there was indeed a difference in both, but no numbers have been forthcoming. I think he is mistaken.

    Now, we both know that a "non-linear trend" that implements a filter of 30 years total renders data outside of that filter irrelevant.

    So my assertion has been that the inclusion of the earlier data -- when the paper states "if we repeat the analysis for the full data period in the GISS database (i.e., 1880–2009)" -- is not really doing anything with that data, but rather just adding years to the 1/n calculation.
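
    The 1/n calculation referred to here can be sketched as follows, assuming the "1/n law" is the standard result that in a stationary series of independent values the k-th year sets a new record with probability 1/k. Whether the paper computed its 0.105 and 0.079 in exactly this way is an assumption; the function name is hypothetical.

```python
def expected_records_stationary(n_years, last=10):
    """Expected number of records in the final `last` years of a stationary series.

    Year k (counted from 1) is a record with probability 1/k, so the
    expectation is the sum of 1/k over the final years.
    """
    return sum(1.0 / k for k in range(n_years - last + 1, n_years + 1))


e_100 = expected_records_stationary(100)  # a 1910-2009 style record
e_130 = expected_records_stationary(130)  # an 1880-2009 style record

print(round(e_100, 4), round(e_130, 4))
```

    The two results come out near 0.105 and 0.080, close to the values quoted from the paper. The function takes only the length of the record as input; no temperature values enter it.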

    This renders the pre-1910 data meaningless in this methodology, and as a colleague has pointed out, even that is not the right date for when the data becomes meaningless, it is more like 1980.

    This should be enough for you to work from, but please ask if unclear. Thanks!

  60. Roger, you state: "As written the paper suggests that the >80% number is the result of this calculation: (0.47-0.079)/0.47 = 83.2% which is >80%"

    What makes you think that the paper suggests this? I do not see it suggesting that in any way. My interpretation is that you backtrack from the given figures (">80%" and 0.79[sic-it's .079, JB]), to calculate the expected number, giving about 0.4 (actually 0.42), as I did in my RC response to your question. If the authors used .079 as their value of 1/n, this means they were very likely also calculating the trend over all 130 years of the record. There is no reason from the text, that I can see, to assume they applied a 100 year trend value to the 130 year period.

    I think this is the foundation of your mistaken interpretation of the whole issue.

  61. -61-Jim Bouldin

    You ask: "What makes you think that the paper suggests this?"

    Because the paper explains the calculation as follows: "This number increases to over 80% if we repeat the analysis for the full data period in the GISS database (i.e., 1880–2009), rather than just the last 100 y, because the expected number for stationary climate then reduces from 0.105 to 0.079 according to the 1∕n law."

    It does not say anything about how the "revised non-linear trend" changes. If it did change I am assuming that they would have informed their readers.

    The new PS at RC would seem to confirm this:

    "The main effect of a longer timeseries is that the expected number of recent new records in a stationary climate gets smaller. So the ratio of extremes expected with climate change to that without climate change gets larger if we have more data. I.e., the probability that the 2010 record was due to climate change is a bit larger if it was a 130-year heat record than if it was only a 100-year heat record."

    I take this to mean that the data prior to 1910 does not matter -- not for the trend nor (certainly) for the expected record breaking.

    You write: "There is no reason from the text, that I can see, to assume they applied a 100 year trend value to the 130 year period."

    I think that the answer to this is that there is no such thing as a 100-year or a 130-year trend value, because of the use of the "non-linear trend" here.

    The paper is misleading when it says, "We therefore repeated the calculation excluding this data point, using the 1910–2009 data instead, to see whether the temperature data prior to 2010 provide a reason to anticipate a new heat record. With a thus revised nonlinear trend . . . "

    1910 is meaningless here. Substitute 1880 or 1930 or 1950 and I don't think that the value of the "thus revised non-linear trend" changes one bit.
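
    The claim about start dates can be illustrated with a minimal sketch, under a loud assumption: the smoother below is a simple triangular-weighted moving average with a 15-year half-width, standing in for a generic finite-window filter. It is not the smoothing method used in the paper, and the data are synthetic random numbers, not the Moscow record. All names are hypothetical.

```python
import random

HALF_WIDTH = 15  # assumed half-width in years (a window of roughly 30 years)


def smoothed_tail(values, n_tail=10, half_width=HALF_WIDTH):
    """Triangular-weighted local mean for the last `n_tail` points.

    Each window is centred on its point and truncated at the end of the
    series. Assumes the series is long enough that no window reaches
    before the first value.
    """
    n = len(values)
    out = []
    for i in range(n - n_tail, n):
        lo = i - half_width
        hi = min(i + half_width, n - 1)
        weights = [half_width + 1 - abs(j - i) for j in range(lo, hi + 1)]
        window = values[lo:hi + 1]
        out.append(sum(w * v for w, v in zip(weights, window)) / sum(weights))
    return out


random.seed(1)
# Synthetic stand-in for an 1880-2009 series (130 made-up values).
synthetic = [random.gauss(0.0, 1.7) for _ in range(1880, 2010)]

# Smoothed values for the final decade, for several start years.
tails = {start: smoothed_tail(synthetic[start - 1880:])
         for start in (1880, 1910, 1930, 1950)}
same = all(tails[start] == tails[1880] for start in tails)
print(same)
```

    For a filter of this kind the smoothed values over the final decade are identical for every start year tried, because no window reaches back past 1985. Whether the same holds for the paper's smoother depends on how that smoother treats distant data, which this sketch does not settle.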

  62. Jim-

    Another way to present this misdirection is to consider this comment from the RC post:

    "We conclude that the 2010 Moscow heat record is, with 80% probability, due to the long-term climatic warming trend."

    Look at the figure -- the "long-term" trend (illustrated by the non-linear dashed line) is actually from a period that begins after 1980. In discussions of climate this is not "long-term" but actually about as short-term as is possible and still be talking about climate (<31 years).

    The data from 1880 show no "long-term" warming.

    So you can talk about warming, but it is short-term.

    You can also talk about the long-term, but there is no warming.

    What is not correct is to talk about long-term warming.

  63. Just because they didn't explicitly say anything about a change in the nonlinear trend doesn't mean there wasn't one. And the postscript, as you correctly quote, starts: "The main effect..." Not the *only* effect, but the *main* effect. A small drop in expected frequency of a record in the last decade from 0.47 to 0.42 and a larger drop in the 1/n values is thoroughly consistent with Stefan's postscript statement. That drop from 0.47 to 0.42 seems likely to me to be due to the occasional record heat year that would have occurred between 1880 and 1910.

  64. "I think that the answer to this is that there is no such thing as a 100-year or a 130-year trend value, because of the use of the "non-linear trend" here."

    That doesn't make any sense Roger, at all. Just because it's not linear doesn't mean it's not a trend. Also, you have been asking here and at RC what the value of the 130 year trend is. I mean come on.

  65. -65-Jim Bouldin

    "Just because they didn't explicitly say anything about a change in the nonlinear trend doesn't mean there wasn't one."

    Sure. Which is precisely why I asked this question at RC:
    http://www.realclimate.org/?comments_popup=9247#comment-217623

    While I welcome your reply, it'd be nice to have the actual numbers and their application from the paper itself. My question remains unanswered.

  66. "Another way to present this misdirection..."

    Roger, I said in #56 that I'd discuss the statistical topic with you, in the absence of accusations of malfeasance, did I not?

  67. -66-Jim Bouldin

    The easy way to resolve this is to provide the actual numbers and math showing how the non-linear trend changes from 1910-2009 to 1880-2009.

    When I wrote -- "I think that the answer to this is that there is no such thing as a 100-year or a 130-year trend value, because of the use of the "non-linear trend" here."

    ... it would have been more precise to have written ...

    "I think that the answer to this is that there is no such thing as a DISTINCT 100-year or a 130-year trend value, because of the use of the "non-linear trend" here."

    My point is that the action in the wiggly trend line is all at the recent end, the period since 1980, and no matter how you deal with the early wiggles, it makes no difference at the end.

    -63- says the same thing, but perhaps more clearly.

    Thanks

  68. "Sure. Which is precisely why I asked this question at RC..."

    What do you mean "sure"? You just immediately above said that there was no such thing as a 100 or 130 year nonlinear trend.

    Furthermore, your accusation that your questions have gone unanswered at RC is glaringly wrong, as both Eric and I gave answers to your questions, and I specifically gave the computation for the number of expected recent records. I mean this is a joke that you would make such a claim.

  69. -69-Jim Bouldin

    Slow down a bit. I have clarified that point in -68-.

    On the second point, I can do backwards math as easy as anyone, and you are just speculating in response to one part of my question. Please don't try to pass that off as authoritative.

    But in case you do know the answer, please do describe _mathematically_ and _specifically_ how the addition of data from 1880-1909 changes the non-linear trend and leads to a reduction in the expected frequency of heat records over the past decade from 0.47 to 0.42.

    Thanks!

  70. -68-Jim Bouldin

    "Roger, I said in #56 that I'd discuss the statistical topic with you, in the absence of accusations of malfeasance, did I not?"

    You can step off that high horse. You are going to be treated with the utmost respect and collegiality here. That of course does not mean that you will be treated like royalty.

    But I apologize if that comment caused you offense on behalf of your friends. Let me rephrase -- the paper implies something that is not the case. Once again:

    The data from 1880 show no "long-term" warming.

    So you can talk about warming, but it is short-term.

    You can also talk about the long-term, but there is no warming.

    What is not correct is to talk about long-term warming.

  71. ""Roger, I said in #56 that I'd discuss the statistical topic with you, in the absence of accusations of malfeasance, did I not?""

    "You can step off that high horse. You are going to be treated with the utmost respect and collegiality here. That of course does not mean that you will be treated like royalty."

    Did I say anything about myself in that statement about accusations Roger? The answer is that no I did not. You read into it that I did, just like you read into the paper that they used the same trend for the two time periods. I don't care who you direct your accusations toward, I'm not going to discuss the matter with you if you do it.

  72. -73-Jim Bouldin

    "Did I say anything about myself in that statement about accusations Roger?"

    Um, yes. You use the word "I" three times. Presumably it is you taking offense at my "accusations," right?

    But rather than arguing about how offended your sensibilities are by my use of the word "misdirection" (and coming from a host of Real Climate that degree of thin-skinnedness is rich, but I digress;-), before you storm off in a huff, do you or don't you have an answer for the following question?

    Can you describe _mathematically_ and _specifically_ how the addition of data from 1880-1909 changes the non-linear trend and leads to a reduction in the expected frequency of heat records over the past decade from 0.47 to 0.42?

    If you wish to discuss numbers and math, then please do so.

    Thanks!

  73. Me: "Did I say anything about myself in that statement about accusations Roger?"

    Roger: "Um, yes. You use the word "I" three times. Presumably it is you taking offense at my "accusations," right?"

    No, Roger, that's not going to work OK? Maybe on some others that you're used to throwing off the trail with your constantly shifting arguments, but not me.

    The "myself" to which I was referring there was with reference to your stated assumption (#72) that I was complaining of being accused of misdirection, which I was *NOT*, and so once again you have made your own interpretation, intentionally or unintentionally mistaken, which you want to pass off.

    You are clearly deep into your game playing now. Your questions have already been answered but you're not getting it. You accuse others of subjective interpretations while completely blind to your own. If I get time I'll try to lay it out for you, but right now, frankly the World Series takes precedence over this waste of time.

  74. -75-Jim Bouldin

    I have no idea what you are talking about.

    I never accused you of misdirection or interpreted your complaint as being about you. My statement about misdirection was in reference to the paper that we are discussing (not you), and I take it you thought that characterization unfair. That is why I wrote, "I apologize if that comment caused you offense on behalf of your friends." _Your friends_, get it?

    Anyway this is exactly the sort of nonsense that blog discussions are perfect for but which are a waste of time.

    Anyway, if you can explain in a reply the answer to this question, that'd be great, because if you have already done so, I've missed it:

    Can you describe _mathematically_ and _specifically_ how the addition of data from 1880-1909 (a) changes the non-linear trend versus that starting in 1910 and (b) leads to a reduction in the expected frequency of heat records over the past decade from 0.47 to 0.42?

    Thanks ...

    Meantime, Go Cards!

  75. "I fear that you may not realise that 'Joshua' and 'chriscolose' are well-known trolls (noted at other blogs - possibly even members of the [Warmista] Rapid Reaction Team)."

    Yes, it's true. I've been found out. I am a member of the RRT - and handsomely paid for my efforts I might add.

  76. Figures you'd be a Cards fan.

    Seeing as how the Cards knocked out the Phils, I'm rooting for Texas.

    Looks like a good choice after those two bombs this inning.

  77. Guys guys guys!!!

    I would just like to add my completely unsolicited and unqualified comment that in the context of this hotly contested debate, upon my most scrupulous and detailed analysis I have to agree with Roger: Go Cards!

    And if Michael Tobis just happens to be around: Booooo Texas!

  78. -78-Joshua

    Cards fan? No, Rockies fan. But NL all the way. Pitchers should hit. DH? Sissy. But congrats to Texas, looks like they are going to win. I always have liked Nolan Ryan.

    Also, I deleted your last comment. If it is important, feel free to resubmit over at the deleted comments thread. This thread will not be about interpretations of Jim Bouldin's rules of engagement, and I believe that is now done.

    Sorry, but that is how it is;-)

  79. Last comment on this - don't post as you see fit.

    You post this:

    " Ir'Rational said... 64

    -54- 5. says it all."

    But you don't post my comment? What could possibly be the consistent criterion? Ir'Rational made his observations about the back-and-forth, as did I. His observation gets posted, and mine doesn't? Because, specifically, my observation was that you breached the agreement?

    Poor form, Roger.

  80. -81-Joshua

    Not sure what your complaint is ... -54-5 was written by me and reads:

    "The paper asserts that adding in the 1880-1909 data strengthens the top line conclusion -- this is just wrong, the data itself has no effect (see Martin Vermeer's comments reproduced above -- Steig was in fact wrong in his comment, which explains why he has not reappeared with those numbers). Adding more years does change the abstract math, thus giving an impression that is simply not true. (Think about it, would adding 3 decades of warmer temperatures to the analysis really make recent extremes statistically _more_ likely? This doesn't even pass the laugh test.)"

    It has nothing to do with Jim Bouldin (who arrives at 56, which is after 54). Ir'Rational was not commenting on the Bouldin exchange. All such comments will be disallowed, and you are not a special case ;-)

    7-5 in the 9th.

  81. -81- Roger.

    You have the bully-pulpit. You make the rules.

    I offered an observation on the back and forth between you and those you were disputing with. As did others. I see no reason why, because I commented on the aspect of whether you remained true to an agreement, my observation should be deleted. It leaves the impression that for some reason, you are particularly sensitive on that issue, and for that reason only deleted my comment.

    The distinction you made, other than that, seems completely arbitrary.

    But you make the rules.

    Can't believe Cruz missed that. Snatching defeat out of the jaws of victory.

    Probably the worst defensive display (combined from both teams) in the history of potential World Series clinching games?

  82. -83-Joshua

    1. You have no idea how many comments have been deleted on this thread. You only know how many people I have explained my decision to -- consider that.

    2. I do not want this thread to turn into a debate over Jim Bouldin's suggestions for my blog moderation policy -- surely you can understand.

    3. If you wish, all comments submitted to the deleted comments thread get published. Go for it if it matters that much.

    STL is cooked. In better news the Rapids beat Columbus 1-0 at a frigid playoff game tonite;-)

  83. Not a big fan of TLR. I like seeing him stuck in a situation where, because of all his moves, he has to pinch-hit with a pitcher in a WS game.

  84. It doesn't matter that much.

    Just seemed arbitrary, that's all. I can understand not wanting the discussion to drag on endlessly.

    Seems like the death pronouncement on St. Louis may have been premature. (Kind of like all those "final nails" in AGW's coffin).

  85. Not sure anyone else in baseball gets the IBB there (given that a hot hitting lefty of Berkman's quality was waiting on deck).

  86. I don't particularly mind the DH, but that game was as good an argument as any against it. That game would have likely been less interesting with a DH.

  87. Well, that was an extremely exciting game, however not a particularly well-played one. Although a fan of an AL team, I believe that the DH is an abomination against the nature of the game. And makes it way too easy on the manager.

    ...Anyway, I arrived late here, but can I take it that whatever the statistical merit of the paper's method, (1) the description is apparently inadequate for its replication, as two persons disagree on its evaluation on an agreed data set [0.47 vs. 0.42]; and (2) with all the back and forth on the subject, nobody has offered code with which to settle the matter.

  88. One of the things I'm happy with, is that the principal scientists are willing to engage each other on the matter, instead of the ivory-tower 'see you in peer review' nonsense. It's important, not in an equivalence kind of way, but in a way that allows consumers to see scientific review happen, rather than just the published product.

    Imagine if we only saw the majority statement from the Supreme Court, and never the dissents? And yet, some folks want to delegitimize or not even acknowledge disagreements or questions that arise. I would like to follow this conversation to the end, but it sounds like two lawyers who know what the other is talking about, but don't want to give the other points :) Nevertheless...I hope a member of the Rapid Reaction Team will step in where others fall away until someone gives the answer (or repeats it again but with more clarity for folks reading along). Or perhaps if we wait long enough, Steve McIntyre or someone will choose to "audit" it. Rahmstorf has come across his desk before.

  89. Thanks for the nuances, Roger - I'm not a statistician nor a climate scientist nor a writer of 'just so' stories of 'attribution' so I can only pay attention to the logic of the arguments (Note I never specified between 'linear' or 'non-linear' though I imagine any short term trend, whether negative or positive, must be 'non-linear'). And what I find is what appears to be a wilful illogicality - hence, the exhausting and ultimately unfruitful engagement with RC (#56 passim - answer the question won't you?). It's a losing wicket - metaphor from a real game, not that silly NFL!

  90. -53- Lewis

    If you're still reading....

    Your explanation seems clear to me, but I'm hoping that you could clarify this one aspect:


    "and they do this without any proper explanation."


    Did they explain what they did in the paper itself?

    You said they didn't provide a "proper" explanation. "Proper," it seems to me, could, and I emphasize "could," be a subjective determination.

    Is it that they offered no explanation, or that some readers have determined that the explanation is statistically invalid?

  91. -94-Joshua

    It is not "proper" if one wants to see the math behind the computation of the 0.47 number. The 0.42 number that Jim Bouldin refers to does not even appear in the paper (and I am skeptical that it is even correct as he alleges). Bouldin had time to leave a half dozen or so posts here and raise a stink about the tone of comments, but didn't apparently have time to provide the simple math that I asked for ... why is that so hard?

    The code would be nice, but really it is nothing more complicated than a spreadsheet (if that).

  92. -95-Roger,

    Like Lewis, I am neither a statistician nor a climate scientist, and I can add that I'm not particularly knowledgeable mathematically (and not terribly intelligent).

    As such, I need things spelled out for me very simply, so bear with me.

    It doesn't seem to me that you answered my question directly. I asked the question of Lewis because I found his earlier explanation very clear - but I welcome your explanation as well.

    I'm still not clear (apologies).

    Are you saying that they offered an explanation but didn't "show their math," i.e., a spreadsheet and/or code?

    That seems qualitatively different, in terms of integrity, than not describing their process in their paper (ideally in the methodology section).

    Is it typical in such papers to "show the math" or provide code?

    In my experience, when I read papers (in different branches of academics) where statistical analyses are conducted, the methodology is described but the actual math is not provided. Typically, I see that "XYZ analysis was conducted with ABC statistical methodology."

    Not to say that the quality of science wouldn't be enhanced if math were always shown or code always provided.

    But I'm trying to understand the precise nature of this situation. In my book, describing the process without showing the math does not equal "cherry-picking." Obviously, a determination of "cherry-picking" is not always black and white. There may be some ambiguity there in that describing a process incompletely, or a complete description of a statistically invalid process (assuming that the author is aware that the process is invalid) could constitute "cherry-picking" as well.

    But I think that all of the contingencies and nuances need to be fully explained before an accusation of "cherry-picking" is leveled. Perhaps you have fully explained all the nuances and my technical and/or intellectual shortcomings are the reason for my confusion - which is why I'm asking for clarification.

    As to your question of "Why is that so hard?" --

    It doesn't seem to me that it should be (assuming I fully understand what you're asking for) - so there are two possible explanations, as I see it. One is that the authors are trying to hide a breach of scientific ethics. The other is that your approach does not engender an open discussion about complicated and nuanced issues (as to the legacy behind your approach, if in fact it doesn't engender openness, I am not in a position to judge).

    Again, as an observer with limited resources to work with, I am not able to decipher which answer is more accurate, if the answer is some combination of those two possibilities, or if there is a completely different explanation.

    My past experience is that the "point-scoring" that I see characterizing debates about climate science is an over-riding force - and it makes it impossible for me to reach definitive conclusions one way or the other, oftentimes.

  93. -96-Joshua

    Thanks, I am happy to answer specific questions. Cherry picking is not a breach of scientific ethics (as I have argued to some considerable opposition from some readers of this blog). But it is not good practice, I will say that.

    Let me proceed step by step, and it may take a few replies but let's start with this:

    A simple question to ask the authors is why their study uses 1910 as a starting date?

    The answer they gave was that the Monte Carlo simulation they used had 100 years, and 1910-2009 is 100 years.

    OK, then why arbitrarily select 100 years? Not, say 130?

    Well, one answer is that 100 years gives a linear warming trend and 130 does not.

    From the paper:

    ACTUAL: "Next we apply the analysis to the mean July temperatures at Moscow weather station (Fig. 1E), for which the linear trend over the past 100 y is 1.8 °C and the interannual variability is 1.7 °C. Their ratio of 0.011∕y yields an expected 0.29 heat records in the last decade, compared to 0.105 in a stationary climate, giving a 64% probability [(0.29 − 0.105)∕0.29] that a heat record is because of the warming trend."

    Here is how that passage would be re-written if the authors had used 130 years:

    REVISED: "Next we apply the analysis to the mean July temperatures at Moscow weather station (Fig. 1E), for which the linear trend over the past 130 y is 0.0 °C and the interannual variability is 1.7 °C. Their ratio of 0.0∕y yields an expected 0.105 heat records in the last decade, compared to 0.105 in a stationary climate, giving a zero probability that a heat record is because of the warming trend."

    [NOTE: 3 of the 4 official temperature records are actually negative 1880-2009 linear trends, I use 0.0 for illustrative purposes]
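
    The contrast between ACTUAL and REVISED can be explored with a rough Monte Carlo sketch. It assumes independent Gaussian noise around a linear trend, expressed in units of the interannual standard deviation; the details of the paper's own simulation are not given in this thread, so the function name, trial count and seed are illustrative assumptions rather than the authors' setup.

```python
import random


def expected_recent_records(trend_ratio, n_years=100, last=10,
                            trials=5000, seed=0):
    """Average number of new records set in the final `last` years.

    Each simulated series is a linear trend (in noise standard deviations
    per year) plus independent Gaussian noise with unit standard deviation.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        running_max = float("-inf")
        for t in range(n_years):
            x = trend_ratio * t + rng.gauss(0.0, 1.0)
            if x > running_max:
                running_max = x
                if t >= n_years - last:
                    total += 1
    return total / trials


# ACTUAL: 1.8 degrees C over 100 years, variability 1.7 degrees C.
ratio_actual = (1.8 / 100) / 1.7
with_trend = expected_recent_records(ratio_actual)

# REVISED: a zero trend, which is the stationary case by construction.
no_trend = expected_recent_records(0.0)

fraction = (with_trend - no_trend) / with_trend
print(f"ratio {ratio_actual:.4f}, with trend {with_trend:.3f}, "
      f"stationary {no_trend:.3f}, fraction {fraction:.2f}")
```

    With a zero trend the simulated count should sit near the stationary value of about 0.105, so the attributable fraction is zero by construction; with the 100-year ratio it should be noticeably higher. The exact figures vary with the seed and number of trials.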

    Now you tell me -- does the difference between ACTUAL and REVISED seem important? It sure does to me ... and then the analysis follows from there, eventually leading to the claim, that I have focused on in the discussions above, that adding in 1880-1909 strengthens (!) the results of the analysis. Given what is above, this is an interesting claim no doubt -- I have characterized it as "misdirection".

    But enough for one reply -- Questions about the above?

  94. Roger - thanks (and thanks for your patience):

    Too much technical even there for me to respond to quickly (really!). I have to look at it more carefully and I don't have time right now. I'll look at it later.

    As a relatively quick response on the non-technical aspects, my opinion is that "cherry-pick" implies a selection of a subset of data without a specification of the criterion for the selection - with the specific intent of serving an agenda that would be undermined if all of the data were used (or an explanation were provided). I consider that to be a breach of scientific ethics (particularly if done intentionally - a very tough standard to prove. Is it possible to "unintentionally" cherry-pick? I don't think so).

    I don't know that the common or "official" definition is there, or even if there is an established definition, but you should know that when you say someone has "cherry-picked," that is my working definition of the term. I'm a "descriptivist" WRT language usage - so ultimately I'm not sure whether, even if there is an "official" meaning for "cherry-pick," it really matters. If my view is a minority opinion, then it probably should be disregarded; but you should know that without a definition provided as to how you interpret the connotations of the term (for someone who can't interpret the technical nuances of the debate on their own to understand in depth how it is being applied) at least one reader assumed you were making an accusation of a breach of scientific ethics. That said, I will be advised if I see you use the term again in the future.

  95. -98-Joshua

    I got grief from the so-called skeptics for defending the hockey stick guys' cherry picks:

    http://rogerpielkejr.blogspot.com/2010/05/picking-cherries-and-hot-fudge.html

    Choices must be made in doing research, that cannot be avoided. If you make choices in such a way as to favor certain results, you should expect to be called on it. I won't call Rahmstorf's decisions unethical, but I have characterized his cherry pick in strong terms even so.

    If you (or anyone else) wants to follow up what I started in -97- above, I am happy to do so. This is a neat case of cherry picking;-)

  96. Geeeze ... Roger asks a simple question ... Colose and Bouldin step up to bat a few hit 'n runs, while a known troll runs interference ... and after 99 comments, there's still no answer from the RC bench to Roger's question.

    Although there have been a few goals scored in some game or other along the way!

    Talk about "climate science as contact sport", eh?!

  97. hro001-#100,

    Science in general isn't a contact sport. But it is a blood sport.

  98. Just caught up with the comments in this thread and wanted to say "Bravo!" Roger for both your analysis and your attitude towards the other commentators despite being sorely provoked.

  99. -98- Roger

    Nope.

    Just wanted to give it some time before coming back (including taking time to look at the thread you linked).

    I appreciate you taking the time to engage, and wouldn't show disrespect by failing to respond.

  100. re 97

    Has the specific question about the hypothetical revision been presented to the authors on RealClimate?

  101. -105-djvjbsl

    I haven't asked them about it, not sure what I'd ask as this is utterly obvious and basically indisputable.

    Maybe the question would be, who are you trying to kid, do you really think we are that stupid? ;-)

  102. -97 - Roger,

    I've written a response, but it's even more rambling than what I usually write.

    I want to sit on it a bit more and see if I can make it more concise (no promises in that regard, you may just get a particularly rambling response if that's the best I can do).

    Eagles/Cowboys is on. Gotta go.

  103. -107-Joshua

    Thanks, but no need to ramble, -97- above should be straightforward.

    The game is a blowout ;-)

  104. Roger -

    Still can't figure out how to write something particularly coherent, but I'm sick of thinking about it so I'll just have to post what I've got.


    "Now you tell me -- does the difference between ACTUAL and REVISED seem important?"

    From the way you lay the question out, I would obviously have to say yes, it **seems** important. If you use a subset of data that shows a significantly different warming trend than the trend shown by the full set of data, then it would seem likely that the revision would be showing a non-trivial difference.

    However, I'm afraid that I don't have the background in statistics to answer that question in a truly meaningful sense in the full context. As near as I can tell, there is a difference of opinion here that is contingent on a different view regarding valid statistical methodology, what comprises valid analysis, and what comprises an *important* difference.

    I can think of examples where a difference such as you described would essentially be trivial. Consider a situation where the years between 1880 and 1910 were highly variable, in a way that seemed anomalous; perhaps the temperatures during those years were attributable to some unusual events such as a volcanic eruption, a very unusually active period of solar radiation, or both. In that case, as long as the selection of a subset of data was explained, it could conceivably lead to a more useful analysis of trends than including the additional 30 years in the calculations. In that kind of situation, the difference would not be "important." I’m not saying that such an explanation was provided in this case – but it does seem to me that there is an explanation (that I don’t understand) provided that explains why the trend analysis excluding the additional 30 years was statistically valid.

    What does seem odd to me is to perform a trend analysis on the shorter data subset and then extend that analysis backwards (as near as I can tell, that’s what they did). But it does seem that they provided an explanation and I don’t feel that my background knowledge enables me to evaluate that explanation.

    I suspect that you'll find this response unsatisfactory. Perhaps you'll think that I'm somehow ducking the issue? All I can tell you is that I'm not. Not understanding the technical aspects of the debate, equivocating is the best I can do.

  105. That all said, I will add two posts more specific to the question of “cherry-picking.”

    Lewis’ explanation seems clear to me - as someone who cannot follow the methodological argument. The way he lays it out, based on my ability to understand the technical aspects, it looks like "cherry-picking." But a conclusion for me is contingent on a few factors: As I define the term (intentional breach of scientific ethics) a determination of "cherry-picking" requires answers to two questions related to his description.

    1) Arbitrariness. He states that the choice of 100 years was "arbitrary." Well, I have two questions WRT “arbitrary”: The first would be whether the selection of a subset of a larger dataset was truly arbitrary. “Arbitrary” would mean to me that it was *not* driven by an attempt to produce a particular, desired outcome. In that case, IMO, an arbitrary selection of data would not constitute "cherry-picking." (The term arbitrary is sometimes ambiguous - it can mean either with no specific reason or it could mean for purely subjective reasons). Would the authors disagree with the characterization of "arbitrary," and accordingly, could they provide an explanation of why it was not arbitrary? Such an explanation would require agreement from at least some % of statistical experts who can reasonably be considered impartial. Or, the authors could argue that the choice wasn’t arbitrary by citing texts on statistical practices that would describe their methodology referring to a completely unrelated, or generic, context.

    2) As I asked in my response to Lewis - was an explanation for the authors' choices provided? In my book, if an explanation was provided then it might be a case of poor science, or it might be a case of controversial science, but it is not a case of "cherry-picking" (unless the authors knew that the explanation was invalid). Disagreeing with the rationality or validity of an **explained** choice does not justify an accusation of "cherry-picking" in my book.

  106. Sorry that I'm not technically knowledgeable enough to give you a more informed debate....

    But I would like to ask you some somewhat tangentially related questions about the link that you posted above. The link was instructive, but it raises some questions for me. The first question is rather simple.


    As near as I can tell, you are saying that "cherry-picking," to one extent or another, is virtually inevitable in any academic analysis - and that it is different, in a qualitative sense, than faking data. Is that right?

    The second question is more complicated:

    Do you think that there is a difference between (1) hiding data with the specific intent of creating an outcome that you know is false (in order to justify a conclusion that you know is invalid) and, (2) selectively presenting some data that you feel support what you consider to be a *valid thesis* while not presenting some potentially conflicting data that you're aware of but do not believe actually invalidate the thesis?

    This relates to my viewpoint that ultimately, a determination of "cherry-picking" requires a determination of intent on the part of the accused (something that requires hard evidence to prove). If an author fails to deal effectively with a counter-argument (but doesn't believe that the counter-argument invalidates his/her thesis), it does not necessarily imply "cherry-picking." It may simply imply that the analysis is poor (which, of course, suggests that the author is a poor academic - somewhat mitigated by his/her overall body of work). If an author knowingly selects a subset of a larger pool of data for the specific purpose of supporting a thesis *that he/she knows is invalidated by the larger set of data* then in my book that is "cherry-picking," which is one form of academic fraud - and altogether a different beast.

  107. Joshua:

    Almost nobody commits fraud knowing their hypothesis is invalid. Your scenario (1) is almost unheard of. Rather, people commit fraud by suppressing contrary evidence. In science, that is considered unethical.

  108. - 112 - Gerard:

    "Almost nobody commits fraud knowing their hypothesis is invalid."

    Probably. But there is no doubt that accusations of precisely that kind of fraud, in the climate debate context, are ubiquitous.

    That is why I take much of what I read in the "skeptical" blogosphere with a huge grain of salt. Much of it is based on finding widespread occurrences of a phenomenon that is likely to be, as you say, rare.
