20 March 2013

If I Taught a Statistics Course

Today I am guest lecturing in a graduate seminar here on Quantitative Methods of Policy Analysis, being taught by Jason Vogel. The subject of today's class is statistics. In preparing for the class I rounded up a set of books that I have found to be particularly useful and I thought I'd share them here, just in case I ever teach a stats class down the road.

These four books would be at the top of my required reading list:

S. Stigler, 2002. Statistics on the Table: A History of Statistical Concepts and Methods, Harvard University Press.

S. Senn, 2003. Dicing with Death: Chance, Risk and Health, Cambridge University Press.

W. Briggs, 2008. Breaking the Law of Averages: Real Life Probability and Statistics in Plain English, LuLu Marketplace (also free to download at http://wmbriggs.com/book/).

M. Mauboussin, 2012. The Success Equation: Untangling Skill and Luck in Business, Sports, and Investing, Harvard Business Review Press.

Among the cases we will discuss today are the NCAA tournament (and Nate Silver's skill), hurricane trends (of course), and a few puzzlers from the books above. It'll be fun. The possible cases for exploring statistical questions and methods are of course endless, and they quickly run up against important questions of research design, epistemology, and philosophy of science, among other topics.

What other books or readings would you recommend?

18 comments:

  1. Worth pointing out that WMB makes the book you recommend here - and another - free to download:

    http://wmbriggs.com/book/

    although I'm sure he'd appreciate the price of a beer as much as the next man

  2. -1-mrsean2k

    Thanks .. just added this link!

  3. I'd think that Kahneman's Thinking, Fast and Slow should be a prerequisite part of your discussion for understanding statistical analysis.

    “People who spend their time, and earn their living, studying a particular topic produce poorer predictions than dart-throwing monkeys who would have distributed their choices evenly over the options.”
  4. Gigerenzer, 2002. Reckoning with Risk, Allen Lane The Penguin Press.
    Where I first came across the phrase 'often wrong, but never in doubt', describing those who fail to question their certainties when they really ought to.

  5. How to Lie With Statistics by Darrell Huff
    http://www.amazon.com/How-Lie-Statistics-Darrell-Huff/dp/0393310728

    Not for this class but for the rest of us.

  6. I really like "How We Know What Isn't So: The Fallibility of Human Reason in Everyday Life" by Thomas Gilovich. Not a statistics text-book by any means, but points out (with great examples) how people often misunderstand or mis-use statistics or patterns.

  7. Joshua:
    This is slightly O/T. Your quotation seems to have empirical validity. I followed Tversky and Kahneman for years. They have done marvelous work and established the field of behavioral economics. However, I am still puzzled by the lack of attention paid by them and many other social psychologists to those who do not display these classic biases. See, for example, Argyris' Inner Contradictions of Rigorous Research.

  8. Box, Hunter and Hunter's "Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building"
    http://www.amazon.com/Statistics-Experimenters-Introduction-Analysis-Building/dp/0471093157/ref=cm_cr_pr_sims_t

    is a reference I have returned to over the years.

  9. The book I was taught with is

    How to Lie with Statistics by Darrell Huff

    http://www.amazon.com/How-Lie-Statistics-Darrell-Huff/dp/0393310728

    It shows you exactly how data can be manipulated to give the desired result.


    Common Errors in Statistics (and How to Avoid Them) by Phillip I. Good and James W. Hardin is also worth reading if you are a non-statistician who has to work with statistics. Those in biomedical fields should be forced to read and understand both books.

  10. bernie -

    Could you elaborate a bit?

  11. Davis Balestracci's paper is a favorite of mine as well:
    Data 'Sanity': Statistical Thinking Applied to Everyday Data
    Abstract: This publication exposes eight common statistical "traps":

    1. Treating all observed variation in a time series data sequence as special causes
    2. Fitting inappropriate "trend" lines to a time series data sequence
    3. Unnecessary obsession with and incorrect application of the Normal distribution
    4. Incorrect calculation of standard deviation and "sigma" limits
    5. Misreading special cause signals on a control chart
    6. Choosing arbitrary cutoffs for "above" average and "below" average
    7. Improving processes through arbitrary numerical goals and standards
    8. Using statistical techniques on "rolling" or "moving" averages

    Keywords: Process-oriented thinking - Time Series Data - Variance Reduction

    http://asq.org/statistics/1998/06/data-sanity-statistical-thinking-applied-to-everyday-data.html?shl=088344#rate

    I have spent many hours trying to undo (and/or modify) action plans that were put in place following evaluations of data that fell under traps 1, 3, 4, 6 and 8.
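    Trap 8 in particular is easy to see in a toy example. A minimal sketch (my own illustration, not code from Balestracci's paper): smoothing two independent noise series with a moving average inflates the magnitude of their apparent correlation, even though the underlying series are unrelated by construction.

```python
# Toy demonstration of trap 8 (my own example, not Balestracci's code):
# applying statistics to moving averages. Smoothing two *independent*
# noise series inflates the magnitude of their apparent correlation,
# because averaging induces autocorrelation and shrinks the effective
# number of independent observations.
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def moving_average(xs, w):
    """Simple trailing moving average with window w."""
    return [sum(xs[i - w:i]) / w for i in range(w, len(xs) + 1)]

rng = random.Random(0)
raw_corrs, smooth_corrs = [], []
for _ in range(200):  # many trials, so the pattern is not a one-off fluke
    x = [rng.gauss(0, 1) for _ in range(300)]
    y = [rng.gauss(0, 1) for _ in range(300)]
    raw_corrs.append(abs(pearson(x, y)))
    smooth_corrs.append(abs(pearson(moving_average(x, 30),
                                    moving_average(y, 30))))

mean_raw = sum(raw_corrs) / len(raw_corrs)
mean_smooth = sum(smooth_corrs) / len(smooth_corrs)
print(f"mean |corr|, raw series:     {mean_raw:.3f}")
print(f"mean |corr|, 30-pt averages: {mean_smooth:.3f}")
```

    With 300-point series and a 30-point window, the mean absolute correlation of the smoothed pairs typically runs several times that of the raw pairs.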

  12. Thanks all, these suggestions are great!

  13. Joshua:
    This is definitely off the topic of this post, so my apologies. It essentially boils down to the fact that in the vast majority of these types of decision-making experiments, not all subjects display the behavior associated with the bias. The empirical results are sufficient to show that the bias exists, but there is, to the best of my recollection, seldom much reported debriefing of the individuals who did not succumb to the experimental manipulation. Argyris' argument is that it would be a better and more ethical experimental design both to be public with the experimental subjects about any experimental manipulation and to have as the research target the generation of the most effective behavior, presumably free from the bias under consideration. Paradoxically and bizarrely, ethics panels are set up to OK the manipulation of other people. Imagine doing a Milgram-type experiment after you have described and explained the results of the original Milgram experiment.

  14. kakatoa:
    Your link appears to be behind a pay wall. I found this: http://www.donaldpoland.com/documents_and_links/5-Other_Documents/Statistical_Thinking.pdf
    I am assuming it is the same article.

  15. Dr. Pielke: I have a copy of "Dicing With Death" on my bunk on my sailboat, on which I live, currently in St Thomas. It has been there now for two years, being read and thumbed through repeatedly. Someday I hope to *really* understand it ... for now, I find it amusing.

  16. I am concerned about Dicing with Death. On page 5 the author smugly asserts that because a girl could be older or younger than her male sibling, the probability space is enlarged. She could be American or British, too. Should this enlarge the probability space again?

    BTW, despite the corny title Sam Savage's The Flaw of Averages is quite readable while presenting rather clear examples of the proper and improper use of statistics with respect to decision making.

  17. Dr Pielke

    Many years ago I gave a series of seminars at the LSE and have maintained my interest in the subject.

    Principal components analysis was one of the techniques I presented to non-statisticians. Later this was useful in helping me understand the criticism of Dr Mann by Steve McIntyre.

    Most economic and environmental measurements are recorded as time-series. Several texts deal with analysis of time series.

    In my opinion, work by Engle and Granger is critically important in the training of people who will be doing research involving time series where there is great risk of spurious correlation. http://en.wikipedia.org/wiki/Cointegration http://en.wikipedia.org/wiki/Granger_causality

    Granger and Engle shared a Nobel for their work.
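    The spurious-correlation risk is easy to demonstrate. As a toy sketch (my own example, not code from Engle and Granger's papers), two independent random walks typically show a sizable correlation in levels, while their first differences, the stationary series, do not:

```python
# Toy illustration of spurious correlation in trending time series:
# two *independent* random walks often correlate strongly in levels,
# while their first differences (the stationary series) do not.
import random

def random_walk(n, rng):
    """Cumulative sum of n independent standard-normal steps."""
    level, walk = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        walk.append(level)
    return walk

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def diffs(xs):
    """First differences of a series."""
    return [b - a for a, b in zip(xs, xs[1:])]

rng = random.Random(0)
level_corrs, diff_corrs = [], []
for _ in range(200):  # many trials, so the pattern is not a one-off fluke
    x, y = random_walk(300, rng), random_walk(300, rng)
    level_corrs.append(abs(pearson(x, y)))
    diff_corrs.append(abs(pearson(diffs(x), diffs(y))))

mean_level = sum(level_corrs) / len(level_corrs)
mean_diff = sum(diff_corrs) / len(diff_corrs)
print(f"mean |corr|, levels:      {mean_level:.3f}")
print(f"mean |corr|, differences: {mean_diff:.3f}")
```

    Cointegration tests formalize the remedy: rather than trusting level correlations at face value, they ask whether some combination of the level series is actually stationary.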

    An Israeli group applied cointegration analysis to climate data, concluding: "We have shown that anthropogenic forcings do not polynomially cointegrate with global temperature and solar irradiance. Therefore, data for 1880–2007 do not support the anthropogenic interpretation of global warming during this period."

    This work was criticized by the reviewers and with some minor changes was accepted and published.

    The authors state that their conclusion does not disprove AGW, but merely shows that the data available at present do not support the theory.

    Reference: Beenstock, Reingewertz, and Paldor, 2012. Polynomial cointegration tests of anthropogenic impact on global warming, Earth Syst. Dynam., 3, 173-188, doi:10.5194/esd-3-173-2012.

    URL: http://www.earth-syst-dynam.net/3/173/2012/esd-3-173-2012.html
