I heard two excellent talks last week. Bent Flyvbjerg, a professor of planning at Aalborg University in Denmark, spoke on “Survival of the Unfittest: Why the Worst Megaprojects [subways, airports, bridges, tunnels] Get Built.” Why? Because of false claims. Cost estimates turn out to be much too low and benefit estimates (such as ridership) much too high. Boston’s Big Dig, for example, has already cost more than three times the original estimate. Cost estimates were too low in 90% of the projects he studied, Flyvbjerg said. The tools used to make those estimates have supposedly improved a great deal over the last few decades, but their accuracy has not.

Lovallo and Kahneman have argued that the underlying problem is “optimism bias”; Flyvbjerg, however, believes the problem is what he now calls strategic misrepresentation — when he used the term lying, people got upset. The greater the misrepresentation, the more likely the project was to be approved — or, turned around, the more truthful the estimates, the less likely the project was to be approved. That is a different kind of bias.

An everyday example is me and my microwave oven. Sometimes I use my microwave to dry my clothes. I’ve done this dozens of times, yet I continue to badly underestimate how long it will take: I guess that a shirt will take 8 minutes to dry; it takes 15. I know I underestimate — but I keep doing it. This is not optimism bias; microwaving is not unexpectedly difficult or unpredictable. The problem, I think, is the asymmetry of the effects of error. If my guess is too short, I have to put the shirt back in the microwave, which is merely inconvenient; if my guess is too long, the shirt may burn — which corresponds to the project not being approved.
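To make the asymmetry concrete, here is a toy decision model in Python. Everything in it is invented for illustration (the drying-time distribution and the two cost numbers are assumptions, not data), but it shows the mechanism: when overshooting costs much more than undershooting, the guess that minimizes expected cost sits well below the average time, so a perfectly rational guesser underestimates every time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical numbers: suppose the true drying time averages 15 minutes,
# with some spread from shirt to shirt.
true_times = rng.normal(loc=15.0, scale=3.0, size=100_000)

COST_UNDER = 1.0   # per minute short: restart the microwave (inconvenient)
COST_OVER = 10.0   # per minute over: risk scorching the shirt (costly)

def expected_cost(guess, times):
    short = np.maximum(times - guess, 0.0)  # minutes still needed
    over = np.maximum(guess - times, 0.0)   # minutes of overshoot
    return (COST_UNDER * short + COST_OVER * over).mean()

guesses = np.arange(5.0, 25.0, 0.1)
costs = [expected_cost(g, true_times) for g in guesses]
best = guesses[int(np.argmin(costs))]

print(f"mean drying time:      {true_times.mean():.1f} min")
print(f"cost-minimizing guess: {best:.1f} min")
# The optimal guess lands near the COST_UNDER / (COST_UNDER + COST_OVER)
# quantile of the time distribution, well below the mean, so the
# "underestimate" is the rational choice, not a mistake.
```

This is the same logic as the classic newsvendor problem: under an asymmetric loss, the best guess is a quantile of the distribution, pulled toward whichever kind of error is cheaper.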
Incidentally, Flyvbjerg has written a paper defending case studies and, by extension, self-experimentation. He quotes Hans Eysenck, who originally dismissed case studies as anecdotes: “Sometimes we simply have to keep our eyes open and look carefully at individual cases — not in the hope of proving anything but rather in the hope of learning something.” Exactly.
The other excellent talk (“Scagnostics” — scatterplot diagnostics) was by Leland Wilkinson, author of The Grammar of Graphics and developer of SYSTAT, who now works at SPSS. He described a system that classifies scatterplots. If you have twenty or thirty measures on each of several hundred people or cities or whatever, how do you make sense of them? Wilkinson’s algorithms measure such properties of a scatterplot as its texture, clumpiness, skewness, and four others I don’t remember. You use these measures to find the most interesting scatterplots. He illustrated the system with a set of baseball statistics — many measurements made on each of several hundred major-league baseball players. The scatterplot with the most outliers was stolen bases versus age: stolen bases generally decline with age, but there are many outliers. Although a vast number of statistical procedures assume normal distributions, Wilkinson’s tools revealed normality to be a kind of outlier itself. In the baseball dataset, only one scatterplot had both variables normally distributed: height versus weight. These tools may eventually be available in R.
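I don’t know the details of Wilkinson’s actual measures, but the flavor of the system is easy to sketch: compute a numeric score for every pairwise scatterplot and sort the pairs by it. The toy Python below uses a crude “outlying” score (the fraction of points far from the bulk under a fitted two-dimensional Gaussian) on invented stand-in data; the column names are made up, and the score is only a stand-in for Wilkinson’s algorithms, not a reproduction of them.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)

# Invented stand-in for the baseball table: n players, a few numeric columns.
n = 500
data = {
    "age": rng.normal(28, 4, n),
    "height": rng.normal(73, 2, n),
    "weight": rng.normal(190, 15, n),
    "steals": rng.exponential(8, n),   # skewed and outlier-prone on purpose
}

def outlying_score(x, y):
    """Crude 'outlying' measure: share of points far from the bulk,
    judged by Mahalanobis distance under a fitted 2-D Gaussian."""
    pts = np.column_stack([x, y])
    centered = pts - pts.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(centered, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    return (d2 > 9.21).mean()  # 9.21 ~ chi-square(2) 99th percentile

# Score every pairwise scatterplot, then rank them.
scores = {
    (a, b): outlying_score(data[a], data[b])
    for a, b in combinations(data, 2)
}
for (a, b), s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{a} vs {b}: {s:.3f}")
```

With real data, the pairs that float to the top of such a ranking are the scatterplots worth eyeballing first, which, as I understood the talk, is the whole point.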