The Trouble With Rigor

This is an easy question: When writing down numbers, when is it bad to be precise? Answer: When you exceed the precision to which the numbers were measured. If a number was measured with a standard error of 5 (say), don’t record it as 150.323.

But this, apparently, is a hard question: When planning an experiment, when is it bad to be rigorous? Answer: When the effort involved is better used elsewhere. I recently came across the following description of a weekend conference for obesity researchers (December 2006, funded by the National Institute of Diabetes & Digestive & Kidney Diseases):

Obesity is a serious condition that is associated with and believed to cause much morbidity, reduced quality of life, and decreased longevity. . . . Currently available treatments are only modestly efficacious and rigorously evaluating new (and in some cases existing) treatments for obesity are clearly in order. Conducting such evaluations to the highest standards and so that they are maximally informative requires an understanding of best methods for the conduct of randomized clinical trials in general and how they can be tailored to the specific needs of obesity research in particular. . . . We will offer a two-day meeting in which leading obesity researchers and methodologists convene to discuss best practices for randomized clinical trials in obesity.

“Rigorously evaluating new treatments”? How about evaluating them at all? Evaluation of new treatments (such as new diets) is already so difficult that it almost never occurs; here is a conference about how to make such evaluations more difficult.

This mistake happens in other areas, too, of course. Two research psychiatrists have complained that misguided requirements for rigor have had a very bad effect on finding new treatments for bipolar disorder.

More Reason to Crazy-Spice

Spices are good for you, I blogged, because they are high in antioxidants. A new study, done in Singapore with elderly subjects, supports this conclusion. It found that curry-eaters do better than others on a mental test. The abstract:

Curcumin, from the curry spice turmeric, has been shown to possess potent antioxidant and antiinflammatory properties and to reduce β-amyloid and plaque burden in experimental studies, but epidemiologic evidence is lacking. The authors investigated the association between usual curry consumption level and cognitive function in elderly Asians. In a population-based cohort (n = 1,010) of nondemented elderly Asian subjects aged 60-93 years in 2003, the authors compared Mini-Mental State Examination (MMSE) scores for three categories of regular curry consumption, taking into account known sociodemographic, health, and behavioral correlates of MMSE performance. Those who consumed curry “occasionally” and “often or very often” had significantly better MMSE scores than did subjects who “never or rarely” consumed curry. The authors reported tentative evidence of better cognitive performance from curry consumption in nondemented elderly Asians, which should be confirmed in future studies.

Tze-Pin Ng, Peak-Chiang Chiam, Theresa Lee, Hong-Choon Chua, Leslie Lim, and Ee-Heok Kua. Curry Consumption and Cognitive Function in the Elderly. American Journal of Epidemiology 2006;164(9):898-906.

Too Few Riders, Too Many Stolen Bases

I heard two excellent talks last week. Bent Flyvbjerg, a professor of planning at Aalborg University in Aalborg, Denmark, spoke on “Survival of the Unfittest: Why the Worst Megaprojects [subways, airports, bridges, tunnels] Get Built.” Why? Because of false claims. Cost estimates turn out to be much too low and benefit estimates (such as ridership) much too high. Boston’s Big Dig, for example, has already cost more than three times the original estimate. Cost estimates were too low in 90% of projects, Flyvbjerg said. The tools used to make those estimates have supposedly improved a great deal over the last few decades, but their accuracy has not improved. Lovallo and Kahneman have argued that the underlying problem is “optimism bias”; Flyvbjerg, however, believes that the problem is what he now calls strategic misrepresentation — when he used the term lying, people got upset. The greater the misrepresentation, the more likely the project would be approved — or rather, the more truthful the estimates, the less likely the project would be approved. That is a different kind of bias.

An everyday example is me and my microwave oven. Sometimes I use my microwave oven to dry my clothes. I’ve done this dozens of times, but I continue to badly underestimate how long it will take. I guess that a shirt will take 8 minutes to dry; it takes 15 minutes. I know I underestimate — but I keep doing it. This is not optimism bias. Microwaving is not unexpectedly difficult or unpredictable. The problem, I think, is the asymmetry of the effects of error. If my guess is too short, I have to put the shirt back in the microwave, which is merely inconvenient; if my guess is too long, the shirt may burn — which corresponds to the project not being approved.

Incidentally, Flyvbjerg has written a paper defending case studies and, by extension, self-experimentation. He quotes Hans Eysenck, who originally dismissed case studies as anecdotes: “Sometimes we simply have to keep our eyes open and look carefully at individual cases — not in the hope of proving anything but rather in the hope of learning something.” Exactly.

The other excellent talk (“Scagnostics” — scatterplot diagnostics) was by Leland Wilkinson, author of The Grammar of Graphics and developer of SYSTAT, who now works at SPSS. He described a system that classifies scatterplots. If you have twenty or thirty measures on each of several hundred people or cities or whatever, how do you make sense of it? Wilkinson’s algorithms measure such properties of a scatterplot as its texture, clumpiness, skewness, and four others I don’t remember. You use these measures to find the most interesting scatterplots. He illustrated the system with a set of baseball statistics — many measurements made on each of several hundred major-league baseball players. The scatterplot with the most outliers was stolen bases versus age. Stolen bases generally decline with age but there are many outliers. Although a vast number of statistical procedures assume normal distributions, Wilkinson’s tools revealed normality to be a kind of outlier. In the baseball dataset, only one scatterplot had both variables normally distributed: height versus weight. These tools may eventually be available with R.
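To make the idea concrete, here is a toy sketch of the ranking workflow in Python. Wilkinson’s actual scagnostic measures are graph-theoretic (built on minimum spanning trees and convex hulls); the two stand-in measures below — per-axis skewness and a robust-z-score outlier count — and the function name `pairwise_scagnostics` are my own simplifications, not his algorithm. The point is just the shape of the method: compute a few numbers per scatterplot, then sort the scatterplots by those numbers to find the interesting ones.

```python
import numpy as np

def pairwise_scagnostics(data, names):
    """Rank all scatterplots (pairs of columns of `data`) by crude
    'interestingness' measures. A toy stand-in for real scagnostics."""

    def skewness(x):
        # Standardized third moment; 0 for symmetric data.
        x = np.asarray(x, dtype=float)
        s = x.std()
        return 0.0 if s == 0 else float(((x - x.mean()) ** 3).mean() / s ** 3)

    def outlier_count(x, y):
        # Count points more than 3 robust z-scores (MAD-based) from the
        # median on either axis.
        def robust_z(v):
            med = np.median(v)
            mad = np.median(np.abs(v - med)) or 1.0
            return np.abs(v - med) / (1.4826 * mad)
        return int(np.sum((robust_z(x) > 3) | (robust_z(y) > 3)))

    results = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            x, y = data[:, i], data[:, j]
            results.append({
                "pair": (names[i], names[j]),
                "outliers": outlier_count(x, y),
                "skew": abs(skewness(x)) + abs(skewness(y)),
            })
    # Most "interesting" scatterplots first: most outliers, then most skew.
    return sorted(results, key=lambda r: (-r["outliers"], -r["skew"]))
```

With a baseball-like dataset (columns for age, stolen bases, height), a pair such as stolen bases versus age — mostly declining, with a few prolific outliers — would float to the top of the ranking, while a well-behaved pair like height versus weight would sink to the bottom.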

David Jenkins on the Shangri-La Diet

David Jenkins, a professor of nutrition at the University of Toronto, invented the glycemic index, probably the most important nutritional innovation of the last thirty years. The glycemic index helped me permanently lose 6 pounds (see Example 7 of this paper). While preparing her CBC piece about the Shangri-La Diet, Sarah Kapoor interviewed Jenkins. Here is a partial transcript of what he said.

The Writing Cure

I wonder how many bloggers know about this — research about the beneficial effects of journal writing. James Pennebaker, a professor of psychology at the University of Texas Austin, has done a lot of research in this area. Here is a list of studies. This article sums it up nicely: “Writing about important personal experiences in an emotional way for as little as 15 minutes over the course of three days brings about improvements in mental and physical [!] health. This finding has been replicated across age, gender, culture, social class, and personality type.”

I’m guessing this research started as a search for the crucial ingredients of psychotherapy. What happens during psychotherapy that helps people? Early research found that the therapist’s training made no detectable difference. This suggested that just telling one’s story was therapeutic. Journal writing is another step in the same direction: You tell your story without anyone listening. Next step: studying the health effects of blogging.