I have been asked to write six columns about common scientific mistakes for the journal Nutrition. This is a draft of the first. I am very interested in feedback, especially about what you don’t like.
Lesson 1. Doing something is better than doing nothing.
“You should go to the studio every day,” a University of Michigan art professor named Richard Sears told his students. “There’s no guarantee that you’ll make something good — but if you don’t go, you’re guaranteed to make nothing.” The same is true of science. Every research plan has flaws, often big ones — but if you don’t do anything, you won’t learn anything.
These columns are about common scientific mistakes. The mistakes I see are mostly mistakes of omission: valuable research not done.
A few years ago I visited a pediatrician in Stockholm. She was interested in the connection between sunlight and illness (children are much healthier in the summer) and had been considering doing a simple correlational study. When she told her colleagues about it, they said: Your study doesn’t control for X. You should do a more difficult study. It was awful advice. In the end, she did nothing.
Science is all about learning from experience; it is a kind of fancy trial and error. But this modest description is not enough for some scientists, who create rules about proper behavior. Rule 1: You must do X (e.g., double-blind placebo-controlled experiments). Rule 2: You must not do Y (e.g., “uncontrolled” experiments). Such ritualistic thinking is common in scientific discussions. It hurts not only the discussants, whom it makes dismissive, but also those they might help. Sure, some experimental designs are better than others. The problem is the overstatement: the notion that experiments of a certain kind are not worth doing at all. The forbidden experiments, whatever their flaws, are probably better than nothing.

One group that has suffered from this way of thinking is people with bipolar disorder. Over the last thirty years, few new treatments for the illness have been developed. According to Post and Luckenbaugh (2003, p. 71), “many of us in the academic community have inadvertently participated in the limitation of a generation of research on bipolar illness . . . by demands for methodological purity or study comprehensiveness that can rarely be achieved.”
Rituals divide behavior into right and wrong; science is more practical. The statistician John Tukey wrote about ritualistic thinking among psychologists in an article called “Analyzing data: Sanctification or detective work?” (Tukey, 1969). One of his examples involved measurement typology. The philosopher of science N. R. Campbell had come up with the notion, popularized by Stevens (1946), that scales of measurement could be divided into four types: ratio, interval, ordinal, and nominal. Weight and age are ratio scales, for example; rating how hungry you are is an ordinal measure. The problem, said Tukey, was the accompanying prohibitions. Campbell said you can add two measurements (e.g., two heights) only if the scale is ratio or interval; if you are dealing with ordinal or nominal measures, you cannot. The effect of such prohibitions, said Tukey, is to make it less likely that you will learn something you could have learned. (See Velleman and Wilkinson, 1993, for more about what’s wrong with this typology.)
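To make the prohibition concrete, here is a toy example in Python with NumPy; the ratings are hypothetical, invented purely for illustration. Hunger ratings are ordinal, so under the typology adding or averaging them is forbidden, yet the “forbidden” means plainly capture a real difference:

    import numpy as np

    # Hypothetical hunger ratings on an ordinal 1-5 scale
    # (1 = not hungry, 5 = very hungry).
    before_meal = np.array([4, 5, 4, 3, 5, 4])
    after_meal = np.array([1, 2, 1, 1, 2, 1])

    # Averaging ordinal measures is "forbidden" by the typology,
    # yet the means clearly track a real difference.
    print(before_meal.mean())  # 4.17
    print(after_meal.mean())   # 1.33

As Velleman and Wilkinson (1993) argue, whether a statistic is informative depends on the data and the question being asked, not on the scale label attached to the measurements.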
I fell victim to right-and-wrong thinking as a graduate student. I had started to use a new way to study timing and had collected data from ten rats. I plotted the data from each rat separately and looked at the ten graphs. I did not plot the average of the rats because I had read an article about how, with data like mine, averages can be misleading: they can show something not present in any of the data being averaged. For example, if you average bimodal distributions you may get a unimodal distribution, and vice versa. After several months, however, I averaged my data anyway; I can’t remember why. Looking at the average, I immediately noticed a feature of the data (symmetry) that I hadn’t noticed when looking at each rat separately. The symmetry was important (Roberts, 1981).
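The averaging artifact is easy to demonstrate with made-up numbers; the sketch below (Python with NumPy, hypothetical counts rather than my rat data) shows both directions:

    import numpy as np

    # Hypothetical per-rat response histograms over 8 time bins.
    # Every rat is bimodal, but the modes fall in different bins,
    # so averaging fills in the valleys and the average is flat.
    bimodal_rats = np.array([
        [9, 1, 1, 1, 9, 1, 1, 1],
        [1, 9, 1, 1, 1, 9, 1, 1],
        [1, 1, 9, 1, 1, 1, 9, 1],
        [1, 1, 1, 9, 1, 1, 1, 9],
    ])
    print(bimodal_rats.mean(axis=0))  # [3. 3. 3. 3. 3. 3. 3. 3.]

    # The reverse: each rat is unimodal, with its single mode in a
    # different bin, yet the average across rats has two modes.
    unimodal_rats = np.array([
        [1, 9, 1, 1, 1],
        [1, 1, 1, 9, 1],
    ])
    print(unimodal_rats.mean(axis=0))  # [1. 5. 1. 5. 1.]

The artifact is real, in other words; the mistake is treating it as a rule that averages must never be examined.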
A corollary is this: if someone else did something, they probably learned something, and you can probably learn something from what they did. For a few years I attended a meeting called Animal Behavior Lunch, where we discussed new animal-behavior articles. The meetings consisted of graduate students talking at great length about the flaws of that week’s paper. The professors in attendance knew better, but somehow we did not manage to teach this. The students seemed to have a very strong bias toward criticism. Perhaps they had been told that “critical thinking” is good; they may never have been told that appreciation should come first. I suspect that this failure to teach graduate students to see the virtues of flawed research is the beginning of the problem I discuss here: mature researchers who don’t do this or that because they have been told not to (it is “flawed”) and as a result do nothing.
References
Post, R. M., & Luckenbaugh, D. A. (2003). Unique design issues in clinical trials of patients with bipolar affective disorder. Journal of Psychiatric Research, 37, 61-73.
Roberts, S. (1981). Isolation of an internal clock. Journal of Experimental Psychology: Animal Behavior Processes, 7, 242-268.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677-680.
Tukey, J. W. (1969). Analyzing data: Sanctification or detective work? American Psychologist, 24, 83-91.
Velleman, P. F., & Wilkinson, L. (1993). Nominal, ordinal, interval, and ratio typologies are misleading. The American Statistician, 47, 65-72.