20 February 2012

A Case Study in (how not to do) Quantitative Policy Analysis

Writing in the Boulder Daily Camera yesterday, Tom Rohrer identifies a howler of a mistake in a recent report by the City of Boulder on "Safe Streets." The city has been installing "flashing crosswalks" around town, which light up to let motorists know that a person or bike is about to cross the street. In theory, the signals are supposed to lead the drivers to stop, the pedestrians or bikers to cross, and everyone to go on their way. In practice, the "flashing crosswalks" have been the location of some pretty nasty car-pedestrian accidents.

So it is a surprise that the city has issued a report claiming that the "flashing crosswalks" are safer than non-flashing crosswalks. Rohrer explains where the city's analysis is flawed:
[T]he city recently released their Safe Streets Boulder Report, in which they quite correctly note that a very large number of the accidents between motor vehicles and either a pedestrian or a bicyclist occurred in crosswalks at intersections, and a rather smaller number occurred in the flashing crosswalks. Moreover, I don't doubt their arithmetic was correct when they converted those accident counts to percentages.

On the face of it, the report's finding that 37 percent of the pedestrian/bicyclist-vehicle accidents were in crosswalks at intersections while "only" 6 percent of such accidents were in flashing crosswalks seems to be evidence that the flashing crosswalks are statistically more safe than anyone thought, and that if anything we should be worried about the crosswalks at our city's intersections.

Unfortunately those numbers are profoundly misleading.

You see, the report doesn't mention that there are only a few flashing crosswalks in the entire city (only 18 have ever been installed). However, there is a very large number of the crosswalks at intersections -- four at almost every intersection, totaling hundreds if not thousands across the city. So it is not terribly surprising that the city found a smaller number of accidents in a very small number of flashing crosswalks, while there were a larger number of accidents in the much larger number of crosswalks at intersections.

To correctly compare the safety of flashing crosswalks with those at intersections, one needs to know the accident rates for each type of crosswalk.
Under a more appropriate methodology, the proper conclusion is that the "flashing crosswalks" have a much higher accident rate than non-flashing crosswalks -- the exact opposite of the City's conclusion. Thus, a poor quantitative analysis can do more to mislead than to clarify. This clear and simple example will be part of my graduate seminar on quantitative methods of policy analysis the next time I teach it.
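To make the rate comparison concrete, here is a minimal sketch. Only the figure of 18 flashing crosswalks and the 37%/6% accident shares come from the post; the totals of 100 accidents and roughly 1,000 intersection crosswalks are invented purely for illustration.

```python
# Comparing counts vs. rates. Only the 18 flashing crosswalks and the
# 37% / 6% accident shares come from the post; the totals of 100 accidents
# and 1,000 intersection crosswalks are invented for illustration.

def accident_rate(accidents, crosswalks):
    """Accidents per crosswalk -- a simple exposure-adjusted rate."""
    return accidents / crosswalks

intersection_rate = accident_rate(37, 1000)  # 0.037 accidents per crosswalk
flashing_rate = accident_rate(6, 18)         # ~0.33 accidents per crosswalk

# Despite the smaller raw count, the flashing crosswalks come out with a
# rate roughly nine times higher under these assumed totals.
print(round(flashing_rate / intersection_rate, 1))  # 9.0
```

Any choice of plausible totals yields the same qualitative reversal: dividing by exposure flips the ranking that the raw counts suggest.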


  1. "Under a more appropriate methodology the proper conclusion is that the "flashing crosswalks" have a much higher accident rate than non-flashing crosswalks"

    Not to defend the analysis that was conducted..

    But I would hope that in your seminar you include discussion of how, to really form some conclusion, you'd need to build in analysis of what might happen once people adjust to the new methodology and then feed those results back into your projections.

  2. -1-Joshua

    Thanks, yes, there are lots of questions raised. To test the impact of the flashing crosswalks one would really like to have time series data of before and after the intervention to assess changes in safety statistics. Answering the question, "Did the policy intervention work?" is often among the trickiest.

  3. Roger -

    You may also want to show this video for your students' consideration.

    And again, not to defend the analysis that was conducted, I would also suggest that you talk about the need to control for the variables of the different intersections where the new technologies were installed.

    As far as I can tell, the analysis of Tom Rohrer doesn't control for the possibility that the crosswalks where the technology was installed might be particularly high traffic crosswalks with higher rates of accidents both before and after the installation of the new technology.

    Seems like the relevant metric would be the accident rates at those particular crosswalks before and after the new technology was installed - and it seems that the analysis you highlighted may be drawing false conclusions w/r/t that metric.

    We are not benefited by criticism of poor analysis that is based on poor analysis.

  4. Actually, on second thought, your statement here:

    "To test the impact of the flashing crosswalks one would really like to have time series data of before and after the intervention to assess changes in safety statistics"

    Is completely consistent with what I just wrote. Apologies.

  5. I agree with your comment #2. I would suspect that those flashing ones were either installed at the crosswalks with the most people crossing, or at those with more past problems, or both. So it would be best to know:
    1. the number of people crossing and accidents at each crosswalk before the flashers were installed; and
    2. the number of people crossing and accidents at those crosswalks after installation.

    Often, the most useful data are hard or impossible to get post hoc, so we have to make a series of assumptions -- and in controversial policy areas, people are prone to disagree about which assumptions should be made.

    The best way in my opinion is to manage policy experiments as experiments, get people on different sides to agree what they would consider as good data, and then collect data prior accordingly.

    While this may be too difficult for some issues, like climate change, it may be possible for simpler policy experiments such as crosswalks.

  6. Sorry - I guess I forgot the video link?


  7. -54-Sharon F.

    Thanks, and agreed. What is very interesting about this seemingly simple case is the need to consciously think about data collection in parallel to the implementation. Designing implementation along the lines of experimental design makes really good sense, in those cases where it is possible, but would require some thought.

    Makes for a great class case ;-)

  8. Continuing the idea of getting together in advance to determine what would be acceptable evidence: we did that once.

    I was working with Pew Agbiotech and we had a series of meetings with stakeholders where we developed research questions that should be answered before/during deployment of genetically engineered trees. At the end of the day, there is a tendency for researchers to want infinite research, but we found that there probably is a zone of agreement as to what is reasonable.
    I'm not sure anyone carried this work forward; one reason is that there was no direct connection between that input and any funded federal research program.

  9. Roger,

    In response to your comment #7: in the natural resources world, there is an entire body of literature on an idea called "adaptive management."
    The idea is defined broadly, ranging from a simple management-systems "plan, do, check, act" cycle to "each intervention should be a designed experiment." Different disciplines and practitioners have their own definitions, with the obvious self-interest: researchers tend to want more designed experiments, and managers tend to want something less complex and expensive.

    Nevertheless, students from outside "natural resource world" might learn something from this literature and potentially be able to apply it to other kinds of policy interventions.

    PS. I miss at least one of the tests to prove I'm not a robot each time; they seem too difficult (the n's, r's, and i's in the black part). Do other commenters have helpful advice?

  10. - Roger -

    I assume this comment belongs in this thread?:

    "Designing implementation along the lines of experimental design makes really good sense, in those cases where it is possible, but would require some thought."

    Yes - that really is the key, isn't it? Thanks for that concise framework - I think it is useful.

    No doubt it is relatively easy to cull examples of poor planning. And I have no doubt that viewed as a whole, transportation planning could be substantially improved (primarily, I believe, through increased implementation of participatory planning conducted through stakeholder dialog).

    But I also think that much of the criticism I see of transportation planning, or similar governmental efforts, is based on a binary mentality: a tendency to think that because some initiatives seem mind-numbingly stupid, we can generalize from those examples to form broad conclusions that are equally ill-considered.

    I know some people very involved in transportation planning, and from what they've described to me, a huge roadblock (pun intended) that often interferes with careful analysis combined with implementation is blind resistance to anything governmental (and to open analysis of the externalities related to various types of transportation planning).

    That's why I bristle a bit at this kind of, well, I would call it "cherry-picking."

    How many lives have been saved through improvements based on the implications of various technological developments? It is important to examine this kind of poor analysis in full context.

  11. -9-Sharon F.

    Yep, and check out Esther Duflo et al.'s work on development:


  12. Roger:
    Isn't an equally important question how such a flawed analysis was (a) generated and (b) apparently accepted?

  13. Roger, you would also have to consider the type of "clientele" each crosswalk serves.

    If I recall, the most dangerous crosswalk was at University and Broadway, not surprising given how busy it is, and that most of the people using it are college students, who are notoriously inattentive and apparently unable to gauge risk/reward ratios.

    The city could hire armed guards to escort pedestrians through that crosswalk, and I'd bet it would still have the highest number of incidents, in absolute terms. Simply factoring out crosswalks leading to campus would likely yield entirely different conclusions, in my view.

  14. A similar example follows.

    Only 13% of physicians treated for drug addiction are anesthesiologists. Unfortunately only 5% of physicians are anesthesiologists (Zengerle 2008).

    Regarding captcha solving, even if you have to make two attempts for each post, that would only increase your total posting time by 1% on average (Chambers 2012). No sweat. For what you've posted here today Sharon, that would be about 40 seconds of extra time, certainly less than the amount of time I spent writing this paragraph. #costbenefit
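    Taking the quoted percentages at face value, the anesthesiologist figures make the same base-rate point in miniature; the arithmetic below is a sketch, not from the cited source:

```python
# Same base-rate logic as the crosswalk case, using the quoted figures:
# 13% of physicians treated for drug addiction are anesthesiologists,
# while anesthesiologists are only 5% of all physicians.
share_among_treated = 0.13
share_of_physicians = 0.05

# Anesthesiologists appear about 2.6x more often among addiction-treated
# physicians than their share of the profession alone would predict.
overrepresentation = share_among_treated / share_of_physicians
print(round(overrepresentation, 1))  # 2.6
```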

  15. This is blatant cherry picking on the part of the City. The standard practice for traffic engineers is to calculate an accident rate before and after the intervention, flashing lights in this case.

    If Boulder is like most jurisdictions, they will find that the flashers have little to no measurable effect. I know the City traffic engineers and I suspect they are aware of this. Unfortunately politics sometimes prevents the truth from being told.

  16. -13-Rick

    Thanks, maybe that is why they are digging a pedestrian underpass there ;-)

  17. The captcha has gotten pretty illegible. Last time I had to read five before I found one that was unambiguous.

    At some point in the future, captcha will be actively selecting for robots. Humans will have given up in frustration.

    On the main topic, maybe it's just because it's Monday of the seventh week of the semester, but I don't see why they're trying to reduce pedestrian accidents in a college town. Seems like natural selection in action.

  18. - JMV -

    "If Boulder is like most jurisdictions, they will find that the flashers have little to no measurable effect"

    Do you have some evidence for this?

    "I know the City traffic engineers and I suspect they are aware of this. Unfortunately politics sometimes prevents the truth from being told. "

    Or that?

  19. Thanks for the link, Roger!

    Chris, with the help of your analysis I now feel all better.

    Gerard, you crack me up. Indeed, "humans will have given up in frustration."

    Maybe Roger is challenging us with "MENSA level" captcha. Only the super-smart can post.

  20. Joshua

    I have been a professional traffic engineer for over two decades. A few years ago I was paid a fairly large sum by the City of Denver to investigate this very issue. The problem is that most of the studies are fraught with all sorts of methodological problems, beginning with the fact that we seldom actually know how many pedestrians use a crossing to begin with. On top of that, flashing ped crossings are usually placed at high-profile locations, often the scene of a recent ped accident. However, ped accidents are actually low-frequency occurrences in the first place, so statistically you expect to see a reduction just due to reversion to the mean. Another confounding factor is that they tend to lead to increased pedestrian usage, which alters your baseline. At the end of the day you either have studies that cherry-pick their data to confirm a political choice, or you have ambiguous results.

    As for the city engineers, I could give you names and phone numbers, but that would be inappropriate in a public forum. I have known and worked with some of them for over 20 years. The city actually has some very good engineers. Too bad some of the policy makers aren't of the same quality.
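    The reversion-to-the-mean effect described above can be sketched with a small simulation. All the numbers here are invented (1,000 sites, a true accident chance of 0.1 per site per year); the point is that selecting sites based on a bad year guarantees an apparent improvement even when the intervention does nothing.

```python
# Regression to the mean at accident sites, with invented numbers.
# Every site has the SAME underlying accident rate; we "install flashers"
# only at sites that had an accident last year, then watch those sites
# look safer the next year -- with no intervention effect at all.
import random

random.seed(42)

N_SITES = 1000
TRUE_RATE = 0.1  # chance of an accident per site per year (a rare event)

def simulate_year(n_sites, rate):
    """One year of accident indicators, one Bernoulli draw per site."""
    return [1 if random.random() < rate else 0 for _ in range(n_sites)]

year1 = simulate_year(N_SITES, TRUE_RATE)
year2 = simulate_year(N_SITES, TRUE_RATE)

# "Treat" only the sites that had an accident in year 1, as cities often do.
treated = [i for i in range(N_SITES) if year1[i] == 1]

before = sum(year1[i] for i in treated) / len(treated)  # 1.0 by construction
after = sum(year2[i] for i in treated) / len(treated)   # close to TRUE_RATE

# The "after" rate falls toward 0.1 even though nothing changed; a naive
# before/after comparison would credit the flashers with the improvement.
print(f"before: {before:.2f}, after: {after:.2f}")
```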

  21. Joshua

    If you are interested, here is a summary of some of the research

  22. For such things, there are many resources, but I am very fond of Tufte's concise "Data Analysis for Politics and Policy," which google has informed me is freely available on the web here: http://www.edwardtufte.com/tufte/dapp/

    Tufte's example in Chapter 1 of auto safety inspections and fatalities is the sort of thing the City of Boulder ought to have read.

    P.S., Duflo is awesome. In addition to the povertylab link you provide, people here may find her TED talk worthwhile: http://www.ted.com/talks/esther_duflo_social_experiments_to_fight_poverty.html

  23. - jmv - 21

    Thanks for that link. I notice that the research discussed in your link is specific to in-ground flashing crosswalks -- but the photograph accompanying Roger's post shows a different type of flashing crosswalk, and the article that Roger linked seems to be ambiguous, as does the report referred to in that article.

    I would imagine that there would be different levels of effectiveness depending on the type of "flashing crosswalk" being discussed, and so the research you linked may only be partially instructive.


  24. It all depends on the purpose of the exercise. If the purpose is to inform the public about the costs and benefits of alternatives, then it is an obvious failure.

    On the other hand, if the purpose is to justify the budget for a favored option, it is quite good enough. Nitpicking by "deniers" can be ignored. See JMV @ 15.


  25. These are called Xwalks in Canada. They are quite dangerous. People will press the button to start the flashing and then move immediately into the street. There is one particular case that happened near where I lived: a small boy pressed the Xwalk button to cross the street to a playground, immediately ran into the street, and was struck and killed. There are now fences set up there so that there is no access to the playground at that crossing.

    This is pertinent to the policy issue, but I have long been of the opinion that Xwalks are very dangerous. There are fewer and fewer of them.

  26. JMV, since you know some of Boulder's traffic engineers, could you mention to them that it may be a good idea to cut back the vegetation that obscures a clear view of the bike path for drivers turning right onto University from northbound Broadway? It might be a more cost effective solution than building another pedestrian underpass. Just sayin'