Something is Better Than Nothing

I have been asked to write six columns about common scientific mistakes for the journal Nutrition. This is a draft of the first. I am very interested in feedback, especially about what you don’t like.

Lesson 1. Doing something is better than doing nothing.

“You should go to the studio every day,” a University of Michigan art professor named Richard Sears told his students. “There’s no guarantee that you’ll make something good — but if you don’t go, you’re guaranteed to make nothing.” The same is true of science. Every research plan has flaws, often big ones — but if you don’t do anything, you won’t learn anything.

The mistakes I see are mostly mistakes of omission.

A few years ago I visited a pediatrician in Stockholm. She was interested in the connection between sunlight and illness (children are much healthier in the summer) and had been considering doing a simple correlational study. When she told her colleagues about it, they said: Your study doesn’t control for X. You should do a more difficult study. It was awful advice. In the end, she did nothing.

Science is all about learning from experience. It is a kind of fancy trial and error. But this modest description is not enough for some scientists, who create rules about proper behavior. Rule 1. You must do X (e.g., double-blind placebo-controlled experiments). Rule 2. You must not do Y (e.g., “uncontrolled” experiments). Such ritualistic thinking is common in scientific discussions, hurting not only the discussants — it makes them dismissive — but also those they might help. Sure, some experimental designs are better than others. It’s the overstatement, the notion that experiments in a certain group are not worth doing, that is the problem. It is likely that the forbidden experiments, whatever their flaws, are better than nothing.

A group that has suffered from this way of thinking is people with bipolar disorder. Over the last thirty years, few new treatments for this problem have been developed. According to Post and Luckenbaugh (2003, p. 71), “many of us in the academic community have inadvertently participated in the limitation of a generation of research on bipolar illness . . . by demands for methodological purity or study comprehensiveness that can rarely be achieved.”

Rituals have right and wrong ways; science is more practical. The statistician John Tukey wrote about ritualistic thinking among psychologists in an article called “Analyzing data: Sanctification or detective work?” (Tukey, 1969). One of his examples involved measurement typology. The philosopher of science N. R. Campbell had come up with the notion, popularized by Stevens (1946), that scales of measurement could be divided into four types: ratio, interval, ordinal, and nominal. Weight and age are ratio scales, for example; rating how hungry you are is an ordinal measure. The problem, said Tukey, was the accompanying prohibitions. Campbell said you can add two measurements (e.g., two heights) only if the scale is ratio or interval; if you are dealing with ordinal or nominal measures, you cannot. The effect of such prohibitions, said Tukey, is to make it less likely that you will learn something you could have learned. (See Velleman and Wilkinson, 1993, for more about what’s wrong with this typology.)
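Campbell’s prohibition can be sketched in a few lines of code. This is only an illustration of the rule Tukey objected to, not an endorsement of it; the class and function names are mine, and the scale names follow Stevens (1946).

```python
from enum import Enum

class Scale(Enum):
    NOMINAL = 0   # categories only (e.g., species)
    ORDINAL = 1   # ordered categories (e.g., a hunger rating)
    INTERVAL = 2  # equal intervals, arbitrary zero (e.g., degrees Celsius)
    RATIO = 3     # equal intervals, true zero (e.g., weight, age)

def may_add(scale):
    # Campbell's prohibition, as described above: sums (and hence means)
    # are "allowed" only for interval and ratio scales.
    return scale in (Scale.INTERVAL, Scale.RATIO)

print(may_add(Scale.RATIO))    # True: adding two weights is "permitted"
print(may_add(Scale.ORDINAL))  # False: averaging hunger ratings is "forbidden"
```

Tukey’s point, of course, was that a rule like `may_add` throws away information: a mean of ordinal ratings can still reveal something worth knowing.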

I fell victim to right-and-wrong thinking as a graduate student. I had started to use a new way to study timing and had collected data from ten rats. I plotted the data from each rat separately and looked at the ten graphs. I did not plot the average of the rats because I had read an article about how, with data like mine, averages can be misleading — they can show something not in any of the data being averaged. For example, if you average bimodal distributions you may get a unimodal distribution and vice versa. After several months, however, I averaged my data anyway; I can’t remember why. Looking at the average, I immediately noticed a feature of the data (symmetry) that I hadn’t noticed when looking at each rat separately. The symmetry was important (Roberts, 1981).
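The averaging artifact is easy to demonstrate. The sketch below uses made-up per-subject distributions (not my rat data): each simulated subject’s distribution has two clear peaks, but because the peaks sit at different places for different subjects, the average across subjects has a single peak. The peak positions and spreads are arbitrary choices for illustration.

```python
import math

def normal_pdf(x, mu, sd):
    return math.exp(-((x - mu) ** 2) / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

def bimodal_pdf(x, center, half_sep=2.0, sd=0.7):
    # One subject: equal bumps at center - half_sep and center + half_sep.
    return 0.5 * normal_pdf(x, center - half_sep, sd) + 0.5 * normal_pdf(x, center + half_sep, sd)

def count_modes(ys):
    # Count strict local maxima on the grid.
    return sum(1 for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1])

grid = [k * 0.05 for k in range(-200, 201)]   # x from -10 to 10
centers = [-2, -1, 0, 1, 2]                   # each subject's peaks sit elsewhere

per_subject = [[bimodal_pdf(x, c) for x in grid] for c in centers]
average = [sum(col) / len(centers) for col in zip(*per_subject)]

for c, ys in zip(centers, per_subject):
    print("subject centered at", c, "modes:", count_modes(ys))  # 2 each
print("average modes:", count_modes(average))                   # 1
```

The average smears the individual peaks together, showing a shape present in none of the subjects — exactly the danger the article I had read warned about.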

A corollary is this: If someone (else) did something, they probably learned something. And you can probably learn something from what they did. For a few years, I attended a meeting called Animal Behavior Lunch where we discussed new animal behavior articles. Every meeting consisted of graduate students talking at great length about the flaws of that week’s paper. The professors in attendance knew better, but somehow we did not manage to teach this. The students seemed to have a very strong bias toward criticism. Perhaps they had been told that “critical thinking” is good. They may never have been told that appreciation should come first. I suspect that failure to teach graduate students to see clearly the virtues of flawed research is the beginning of the problem I discuss here: mature researchers who don’t do this or that because they have been told not to (it is “flawed”) and as a result do nothing.

References

Post, R. M., & Luckenbaugh, D. A. (2003). Unique design issues in clinical trials of patients with bipolar affective disorder. Journal of Psychiatric Research, 37(1), 61-73.

Roberts, S. (1981). Isolation of an internal clock. Journal of Experimental Psychology: Animal Behavior Processes, 7, 242-268.

Stevens, S.S. (1946). On the theory of scales of measurement. Science, 103, 677-680.

Tukey, J. W. (1969). Analyzing data: Sanctification or detective work? American Psychologist, 24, 83-91.

Velleman, P. F., & Wilkinson, L. (1993). Nominal, ordinal, interval, and ratio typologies are misleading. The American Statistician, 47(1), 65-72.

Cure Versus Prevention (flies edition)

How to reduce flies? Here’s one way:

A Chinese city suburb has set a bounty on dead flies in a bid to promote public hygiene . . . Xigong, a district of Luoyang in the central province of Henan, paid out more than 1,000 yuan ($125) for about 2,000 dead flies on July 1, the day it launched the scheme with the aim of encouraging cleanliness in residential areas. . . An Internet user said that although the office had good intentions, the action itself had made the district a laughing stock.

“The key point is the government should encourage residents to clean up the environment so that no flies can live there, instead of spending money on dead flies,” the Internet user wrote.

Yes. This gets back to Erika Schwartz’s criticism of Gina Kolata and the NY Times for not mentioning prevention in an article about strokes. Kolata’s article accurately reflected the situation: far more interest in (i.e., money spent on) cure than prevention. It makes as much sense in America as it does in China.

Norman Temple and I wrote about a related problem: more support for high-tech than low-tech research, even though low-tech research has been more helpful. The low-tech research is more prevention-related.

More health-care absurdity.

Omega-3: I Can See For Myself

“The flax seed oil scam” by an herbalist named Henriette says bad things about flaxseed oil. One criticism concerns the (lack of) conversion of ALA (the short-chain omega-3 in flaxseed oil) to EPA and DHA (the long-chain omega-3s found in fish oil and presumably active in the brain):

The scam is in flax seed oil folks trying to maintain that we can convert ALA into EPA and DHA in anything like relevant amounts.

We can’t. We convert at most 10 %, but usually less than half that.

Which is “fairly common knowledge among nutritionists,” says Henriette. She quotes the abstracts of two scientific papers to support this point. The other criticism is that flaxseed oil goes bad quickly:

I dislike flax seed oil for another reason as well: it oxidizes (goes rancid) pretty much the minute it’s pressed, and unless it’s been refrigerated ALL the way from press to consumer, it’s ALWAYS rancid.

After I read this, I realized I was in an unusual position. When it comes to flaxseed oil, I don’t have to take anyone’s word for it. I have been able to measure the benefits by myself on myself. Apparently the conversion ratio, whatever it is, is high enough; and the suppliers of my flaxseed oil (I have used Spectrum Organic, Barlean’s, and the Whole Foods house brand) have solved the oxidation problem.

With almost every other nutrient, my knowledge is far less certain. Sure, I need some Vitamin C, but how much is best? Too much may cause cancer. I’ll probably never know the best amount for the average person, much less the best amount for myself.

Creepy Assertions

The ability of patients to try experimental drugs outside of clinical trials has a lot in common with self-experimentation. The former empowers the patient; the latter empowers the amateur scientist. Another form of health-related empowerment is to allow people to buy and sell organs. Of course, some people are against this:

Nancy Scheper-Hughes, a Berkeley anthropologist — now in residence at Harvard’s Radcliffe Institute — has documented how wealthy organ brokers exploit the impoverished in places like Moldova and South Africa. She cites a moral parable . . . A starving man adrift with others on a raft does not have the right to eat his fellow passengers. [Huh?] Scheper-Hughes suggests there is something of the same “predatory” aspect to organ sales — a creepy assertion “that I have the right to the body of another person, to live.”

From the Boston Globe. To me, the creepy assertion is “I, Professor Scheper-Hughes, know better than other people what they should do with their own bodies.” Alas, this sort of professorial arrogance is common. I encountered it with the UC Berkeley Committee for the Protection of Human Subjects: I must have a certain control group in my experiment, they said. As if they knew how to do my research better than I did. I once heard an NPR commentator, describing her IRB participation, boast about this: “Sometimes a control was missing, or we felt the study was misguided.” A website about IRB abuses has many similar stories.

Science in Action: Omega-3 (a surprise!)

I have always stopped self-experimenting when I travel because so much changes. Surely I will sleep differently, etc., far from home. However, it is not so obvious my arithmetic speed (how fast I do arithmetic problems such as 6 + 3) will change. I am measuring arithmetic speed as part of my study of omega-3 (directory).

I recently spent a week in Los Angeles. For the first time I continued self-experimentation while traveling. When I arrived I bought a bottle of flaxseed oil. I continued to take 4 T/day and did the same mental-function tests I do at home: arithmetic, memory-scanning, and balance. I have described these tests in other posts.

My balance was much worse in Los Angeles, apparently because what I see during the test changed (because the floor and other surroundings are different). I hadn’t realized how much that mattered. My arithmetic and memory-scanning results were roughly the same as the results at home — that is, until the last day. This graph shows arithmetic speeds:

[graph: arithmetic speed]

This graph shows memory-scanning results:

[graph: memory-scanning speed]

The sudden improvement on the last day — also clear in the balance test — was a big surprise. It was too large to be due to practice, nor could it be due to being in LA — the previous 5 measurements were also in LA. It did, however, have a ready explanation: The previous night I had gotten back late and had forgotten to take the oil, so instead of taking 4 T at 11 pm I took it at 7 am. I did the tests at about noon. Instead of 8 or 9 hours between oil ingestion and test, the interval was 5 hours.

If this explanation is correct, there is a short-lived effect of flaxseed oil on brain function — present 5 hours after ingestion but absent or weaker 8 hours later. Which, as a scientist, makes me say “Wow!” If this effect exists, it’s a new tool, the most precious and powerful thing in science. I can use it to compare amounts of flaxseed oil, oils (e.g., fish oil), and foods (e.g., salmon).

My current way of measuring omega-3 effects requires once-a-day tests repeated for weeks. When I reduced the amount of flaxseed oil I was taking from 4 T/day to nothing, it took more than a day at the lower dose before performance even went down, and many more days before performance stabilized. This meant that experiments had to last several weeks. If the new effect exists, it will allow much faster experiments.

Strengths and Weaknesses of Epidemiology: A Semi-Insider’s View

On BART I met a graduate student in epidemiology. “What are the strengths and weaknesses of epidemiology?” I asked. Strengths:

1. It asks important questions, such as: What causes cancer?

2. The results are useful. They can guide public policy. If you learn that smoking causes cancer, you can start an anti-smoking campaign. Epidemiological results can also lead to informative experiments: epidemiology suggests that X causes cancer; you do an experiment to test that conclusion.

Weaknesses:

1. Health is complicated, controlled by many things. Presumably this is why studies often have conflicting conclusions.

2. There is enough flexibility in data analysis that your original hypothesis may influence the way that you analyze your data.

I use epidemiology all the time — here, for example. It often makes an interesting idea more plausible. My ideas about depression, derived from studying the effects of seeing faces, became more plausible to me because of the epidemiology of depression.

A New and Useful Word

The word is black-and-white-ism. For instance:

Berman’s chief problem as a thinker is black-and-white-ism, and this is a good example of his failure to make subtle distinctions.

Scientists are guilty of black-and-white-ism all the time: this statistic is wrong, that way of doing things is a mistake, and so on. John Tukey wrote about this tendency in a paper called “Analyzing data: Sanctification or detective work?” If you believe data analysis is sanctification, there are indeed right ways and wrong ways, as with any ritual. But if science is not a set of rituals, talking about right and wrong confuses graduate students — who begin to think science is a set of rituals — and restricts what you can do. After you say something is wrong, it is harder to do it.

Science in Action: Omega-3 (conference submission)

A few days ago I submitted a title and abstract for a talk to be given at the November 2007 meeting of the Psychonomic Society, a group of experimental psychologists:

Rapid Effects Of Omega-3 Fats On Brain Function

I measured the effect of omega-3 fats on my brain by comparing flaxseed oil (high in omega-3) with other plant fats (low in omega-3) and with nothing. Flaxseed oil improved my balance, increased my speed in a memory-scanning task and in simple arithmetic problems, and increased my digit span. The first three effects were very clear, t > 6. The effects of flaxseed oil wore off in a few days and appeared at full strength within a day of resumption. The best dose was at least 3 tablespoons/day, much more than most flaxseed-oil recommendations. Supporting results come from three other subjects. Because the brain is more than half fat, it is plausible that type of dietary fat affects how well it works. The most interesting feature of these results is the speed and clarity of the improvement. The tools of experimental psychology may be used to determine the optimal mix of fats for the brain with unusual clarity.
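For readers unfamiliar with the statistic quoted in the abstract: a t value is the difference between two means divided by its standard error, so t > 6 means the difference is more than six standard errors wide. The sketch below computes Welch’s t for two invented samples; the numbers are made up for illustration and are not my data.

```python
import math

# Hypothetical daily arithmetic speeds (problems/minute), on and off oil.
on_oil  = [32.1, 31.8, 32.5, 32.9, 31.6, 32.4, 32.2, 33.0]
off_oil = [29.0, 29.6, 28.7, 29.3, 28.9, 29.5, 29.1, 28.8]

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    # Sample variance (n - 1 denominator).
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def welch_t(a, b):
    # Welch's t: difference of means over its standard error,
    # without assuming equal variances.
    se = math.sqrt(var(a) / len(a) + var(b) / len(b))
    return (mean(a) - mean(b)) / se

print(welch_t(on_oil, off_oil))   # well above 6 for these invented samples
```

With an effect this large relative to the day-to-day noise, even small samples give unambiguous t values, which is why the clarity of the improvement stood out.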

If I ever made a time line for my life, this submission would be one of the events.

Directory of my omega-3 posts.

One-Sided Critiques of the Day

Here is an example of the negative evaluation bias I mentioned earlier: Larry Sanger criticizing a comparison of Wikipedia and the Encyclopedia Britannica:

Some might point to Nature’s December 2005 investigative report—often billed as a scientific study, though it was not peer-reviewed—that purported to show, of a set of 42 articles, that whereas the Britannica articles averaged around three errors or omissions, Wikipedia averaged around four. Wikipedia did remarkably well. But the article proved very little, as Britannica staff pointed out a few months later. There were many problems: the tiny sample size, the poor way the comparison articles were chosen and constructed, and the failure to quantify the degree of errors or the quality of writing. But the most significant problem, as I see it, was that the comparison articles were all chosen from scientific topics. Wikipedia can be expected to excel in scientific and technical topics, simply because there is relatively little disagreement about the facts in these disciplines. (Also because contributors to wikis tend to be technically-minded, but this probably matters less than that it’s hard to get scientific facts wrong when you’re simply copying them out of a book.) Other studies have appeared, but they provide nothing remotely resembling statistical confirmation that Wikipedia has anything like Britannica-levels of quality. One has to wonder what the results would have been if Nature had chosen 1,000 Britannica articles randomly, and then matched Wikipedia articles up with those.

“Tiny sample size”? Hmm. How often have you heard “the sample size was too large”?

Here is another example of a one-sided critique: a student’s account of her advisor’s reaction to her work (“My advisor started out tearing apart the things I had done”).