How to Be Wrong (continued)

I asked a friend of mine why she was a good boss. “I was nurturing,” she said. A big study of managers reached essentially the same conclusion: Good managers don’t try to make employees fit a pre-established box, the manager’s preconception about how to do the job. A good manager tries to encourage, to bring out, whatever strengths the employee already has. This wasn’t a philosophy or value judgment; it was what the data showed. The “good” managers were defined as the more productive ones — something like that. (My post about this.)

The reason for the study, as Veblen might say, was the need for it. Most managers failed to act this way. I posted a few days ago about a similar tendency among scientists: When faced with new data, a tendency to focus on what’s wrong with it and ignore what’s right about it. To pay far more attention to limitations than strengths. Here are two examples:

1. Everyone’s heard “correlation does not imply causation”. I’ve never heard a parallel saying about what correlation does imply. It would be along the lines of “something is better than nothing.”

2. Recently I attended a research group meeting in which a postdoc talked about new data she had gathered. The entire discussion was about the problems with it — what she couldn’t infer from it. There could have been a long discussion about how it added to what we already know, but there wasn’t a word about this.

Some of the comments considered this behavior a kind of Bayesian resistance to change in beliefs. But it occurs regardless of whether the new data support or contradict prior beliefs. There’s nothing about prior beliefs in “correlation does not imply causation.” The postdoc wasn’t presenting data that contradicted what anyone knew. Also, similar behavior occurs in other areas besides science (e.g., how managers manage) in which the Bayesian explanation doesn’t fit so well.

I think this tendency is really strong. I was guilty of it myself when discussing it! I made very clear how this tendency is a problem, giving the analogy of a car that could turn left but not right. Obviously bad. I said nothing about the opportunities this tendency gives everyone. My self-experimentation is an example. The more that others reject useful data, the more likely it is that useful data is lying around and doesn’t require much effort to find. I have called this behavior dismissive; I could have called it generous. It’s like leaving money lying on the ground.

A related discussion at Overcoming Bias. What should “correlation does not imply causation” be replaced with?

Addendum. Barry Goldwater weighs in: “I’m frankly sick and tired of the political preachers across this country telling me as a citizen that if I want to be a moral person, I must believe in ‘A,’ ‘B,’ ‘C,’ and ‘D.’” Indeed, preachers spend far more time on what we are doing wrong (and should do less of) than on what we are doing right (and should do more of). The preacher Joel Osteen has taken great advantage of this tendency. “I think most people already know what they are doing wrong,” he told 60 Minutes.

The Lessons of Bilboquet

There are lots of omega-3-related self-experiments I’d like to do: 1. What about fish oil? 2. Is omega-6 bad for the brain, as my olive-oil results suggested? 3. “Blind” experiments where I don’t know what I’ve ingested. I wanted to use a design that involved many tests/day. This would be easy if the tests were fun, hard if they weren’t. Games are fun; could I figure out why and make a mental test that was like playing a game?

After talking with Greg Niemeyer, I decided that color, variety, feedback, and appropriate difficulty (not too little, not too much) were possible reasons games are fun. I constructed a letter-counting task with all of these attributes — and it wasn’t fun. I had to push myself to do it. These attributes may help, but not a lot.

Then, as I’ve posted, a friend gave me a bilboquet. For such a simple object, it was surprisingly fun and slightly addictive. Thinking about other addictive games, such as Tetris (I once played a lot of Tetris), I guessed that the crucial features that make a game addictive are: 1. Success is sharply defined. 2. Not too easy. 3. Hand-eye coordination. (Not just any eye-body coordination: I did thousands of balancing tests but had no trouble stopping.)

I constructed a new task with these attributes: Click the Circle. A circle appears on the screen, you move the pointer to the circle and click on it; a new circle appears somewhere else, you move the pointer to click on it, etc. At the end there’s a little feedback: how long it took. Very simple.
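To make the description concrete, here is a minimal sketch of such a task in Python with tkinter. It is only one way the idea could be written, not the program actually used; the window size, circle size, and number of trials are arbitrary illustrative choices.

```python
# Sketch of a "Click the Circle" task: a circle appears at a random
# spot, you click it, a new circle appears, and after the last click
# the elapsed time is shown as feedback.
import random
import time
import tkinter as tk

TRIALS = 20   # circles to click before feedback (arbitrary)
RADIUS = 25   # circle radius in pixels (arbitrary)
WIDTH, HEIGHT = 600, 400

root = tk.Tk()
root.title("Click the Circle")
canvas = tk.Canvas(root, width=WIDTH, height=HEIGHT, bg="white")
canvas.pack()

state = {"clicks": 0, "start": None, "center": None}

def new_circle():
    """Erase the old circle and draw a new one at a random spot."""
    canvas.delete("all")
    x = random.randint(RADIUS, WIDTH - RADIUS)
    y = random.randint(RADIUS, HEIGHT - RADIUS)
    state["center"] = (x, y)
    canvas.create_oval(x - RADIUS, y - RADIUS, x + RADIUS, y + RADIUS,
                       fill="steelblue", outline="")

def on_click(event):
    x, y = state["center"]
    if (event.x - x) ** 2 + (event.y - y) ** 2 > RADIUS ** 2:
        return  # ignore clicks that miss the circle
    state["clicks"] += 1
    if state["clicks"] >= TRIALS:
        elapsed = time.time() - state["start"]
        canvas.delete("all")
        canvas.create_text(WIDTH // 2, HEIGHT // 2,
                           text="%d circles in %.1f seconds" % (TRIALS, elapsed))
    else:
        new_circle()

canvas.bind("<Button-1>", on_click)
state["start"] = time.time()
new_circle()
root.mainloop()
```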

This task, at least so far, is addictive. I think something else may be going on in addition to the three factors: we enjoy completion, especially visual completion. (Which Tetris had a lot of.) In this case the visual completion is the blank space that appears when I click on a circle. If I have a few dishes to do, it’s easy to do them: the promise of an empty sink (= visual completion) draws me to the task. In contrast, if there are a lot of dishes to do, it’s much harder to do a few of them. I’ll probably do none of them or all of them. If you have 20 dishes to do, doing them will generate a lot more pleasure (and thus will be easier to do in the future) if you can manage to create 20 completion moments than if they get piled up and there is only one completion moment.

How to Be Wrong

There are two mistakes you can make when you read a scientific paper: You can believe it (a) too much or (b) too little. The possibility of believing something too little does not occur to most professional scientists, at least if you judge them by their public statements, which are full of cautions against too much belief and literally never against too little belief. Never. If I’m wrong — if you have ever seen a scientist warn against too little belief — please let me know. Yet too little belief is just as costly as too much.

It’s a stunning imbalance, one I have never seen pointed out. And it’s not just quantity, it’s quality. One of the foolish statements that intelligent people constantly make is “correlation does not imply causation.” The bias toward saying “don’t do that” and “that’s a bad thing to do” is so strong — I think because the people who say such things enjoy saying them — that the people who repeat this slogan never grasp the not-very-difficult points that (a) nothing unerringly implies causation, so don’t pick on correlations, and (b) correlations increase the plausibility of causation. If your theory predicts Y and you observe Y, your theory gains credence. Causation predicts correlation.
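Point (b) is just Bayes’ rule at work. A toy calculation with made-up numbers (the probabilities below are purely illustrative) shows how observing a predicted correlation raises the plausibility of a causal theory without proving it:

```python
# Made-up illustrative numbers: how seeing a correlation that a causal
# theory predicts raises the probability of that theory (Bayes' rule),
# without proving it.
prior_causal = 0.30         # P(theory true) before looking at the data
p_corr_if_causal = 0.90     # P(correlation observed | theory true)
p_corr_if_not = 0.20        # P(correlation observed | theory false)

p_corr = (p_corr_if_causal * prior_causal
          + p_corr_if_not * (1 - prior_causal))
posterior = p_corr_if_causal * prior_causal / p_corr

print(f"before seeing the correlation: P(causal) = {prior_causal:.2f}")
print(f"after seeing the correlation:  P(causal) = {posterior:.2f}")
# 0.30 -> about 0.66: correlation does not imply causation, but it
# does make causation more plausible.
```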

This tendency is so common it seems unfair to give examples.

If you owned a car that could turn right but not left, you would drive off the road almost always. When I watch professional scientists react to this or that new bit of info, they constantly drive off the road: They are absurdly dismissive. The result is that, like the broken car, they fail to get anywhere: They fail to learn something they could have learned.

Addendum. By “too little belief” I meant too little belief in facts — that this or that new observation has something useful to tell us. Thanks to Varangy, who pointed out that there is plenty of criticism of too little belief in this or that favored theory. You could say it is a kind of conservatism.

Interview with Gary Taubes (part 9)

TAUBES Kolata’s response to me reminds me a little of Mike Fumento’s response to me. Did you read that back-and-forth?

INTERVIEWER Yes. This is a litmus test for who the good journalists are and who the bad journalists are. In your Berkeley talk, you quoted Jane Brody: “eating pasta is a good way to lose weight.” There seems to have been some sort of journalistic failure. What was it?

TAUBES Beginning in the 1960s, when newspapers institutionalized this idea of having diet and health/nutrition writers on newspapers (and it’s still the case, for the most part, today), the people who got those jobs weren’t the shining intellects on the newspaper, and the shining intellects didn’t want to be diet and health writers. If you’re a whip-smart young guy or girl who wants to go into journalism, you want to be an investigative reporter, a political reporter, or a war correspondent; you don’t want to write about diet and health. Or at least you didn’t. So I think that was one of the problems. You got not-very-smart people, truly mediocre reporters, doing jobs that turned out to have remarkable significance and influence. I do think that Jane Brody is as responsible as anyone alive for the obesity epidemic. She just bought into this idea of the low-fat diet as a healthy diet, and her sources in New York told her that Atkins was a quack, and that fat was bad, and she never questioned any of it. I don’t know if she had the intellectual wherewithal to do it. In any other field of reporting, as far as I know, reporters are supposed to be as skeptical of their sources as scientists are supposed to be skeptical of their data. Certainly, if George Bush tells a political reporter something, that political reporter doesn’t treat it like it’s true. He might faithfully report what George Bush said, but you’re supposed to be skeptical of what government institutions tell you. So now it’s 1977, the McGovern Committee and the USDA make these proclamations about what constitutes a healthy diet, and there’s simply no skepticism. (With the possible exception of Bill Broad writing in Science magazine, which no one outside the field of science was reading.) So the government tells us that we should eat low-fat diets — and not even learned authorities in the government, but Congressmen and USDA bureaucrats channeling 30-year-old congressional staffers — and lo and behold, all these health reporters decide it must be true. That’s the failure. In my fantasy life, I get a call from the managing editors of the New York Times and the Washington Post and the Wall Street Journal and they say they’ve read my book and they want to know how they can improve their health and diet reporting. Because they can see, whether or not I’m 100% right, or 80%, or only 50%, surely their reporters did something wrong. Now there’s a fantasy for you.

INTERVIEWER Yeah, I agree. That makes sense. So, what would you say?

TAUBES I haven’t figured that one out yet. Get some of your political reporters to do the health writing. Get the smarter people on the paper to do it.

INTERVIEWER Well, I always thought of you as one of the very few science writers who was sufficiently skeptical. Practically none of them are.

TAUBES That’s basically the problem. This lack of skepticism. But I had an advantage. . . You’ll remember, in my first book, I got to live at a physics laboratory and I was lied to regularly by a Nobel Prize-winning physicist. His conception of truth was what he needed to be true at the moment, and what he could get people to believe. So if you called him on the lie, and he was kind of a charming fellow, he would acknowledge that he might have misled you, and then he would step back and try another lie, because it wasn’t in his best interest to tell the truth. Then I did this book on cold fusion where I spent three years, basically, getting lied to constantly by anyone who thought it was in their best interest. There was a period in my life where it was hard for me to trust anyone, because I’d just been around too many people who believed that the truth was what was convenient. I also knew, by the time I got into public health reporting, I knew what it took to do good science. So, if somebody wasn’t doing it, I knew there was no reason to put them on a pedestal. The first article I ever wrote for Science magazine was an investigative piece about an alleged fraud in the cold fusion episode — a fundamental result that kept the field alive for another few months couldn’t be explained by nuclear physics. That alone was so remarkable — as one of the smartest men in the world suggested to me, a physicist named Dick Garwin at IBM — that it should have made everyone suspect fraud. If something can’t be explained by a very well-tested theory, you would question the ethics of the researcher who did the work before you’d question the theory. This is Hume’s idea that eyewitness testimony is never good enough to make you believe in the existence of a miracle, because a miracle is, by definition, something that’s impossible, by all our accepted theories. It’s easier to believe that 10, 100, or 1000 people were deluded or dishonest than it is to believe that the Virgin Mary really did appear in Times Square or whatever your miracle of choice is. Anyway, I’m writing this story for Science about an alleged incident of fraud that took place at Texas A&M, and the editor had a Master’s degree in mathematics from Texas A&M. He took it upon himself to call some of the professors I interviewed, and he would ask them if they really believed what I said they believed, which was not completely unreasonable, considering I’d never written for the magazine before. But then he would say to me, “Well, I talked to professor so-and-so, and he says he doesn’t believe what you said he believed.” And I would say, “Well, this is six months after the fact. Let’s go look at the lecture he gave six months ago, and here’s the paper he wrote on the lecture, and here’s the sentence where he says what he believed then, which is what we’re writing about.” And this editor’s response was “How could you question him? He’s a PhD, and you’re not.”

INTERVIEWER That’s rather unfortunate.

TAUBES This was around 15 years ago; it’s still one of those memorable moments in my life.

INTERVIEWER This guy was an editor at Science magazine?

TAUBES An editor for the journal Science.

INTERVIEWER Oh yeah — that’s really bad. Really, really bad.

TAUBES It’s a common response you see: what right does Taubes have to say this stuff? He’s not a scientist. It’s like “The Wizard of Oz,” where in order to be a scientist or be taken seriously in science, somebody has to first give you the piece of paper?

INTERVIEWER On a scale of sharpness of criticism, from one to a hundred, that ranks about a zero.

Interview directory.

Jonathan Schwarz, Philosopher of Science

In his New Year’s Resolutions, Jonathan Schwarz vowed to “accentuate the positive”:

At all times in history, there have been zillions of people doing wonderful things with little recognition. 99% of the attention goes to various monsters. Even when the attention is extremely negative (i.e., people like us yowling about Dick Cheney or Thomas Friedman) it suggests the monsters are the only ones doing anything important, and the rest of us have nothing better to do than talk about them.

This is empirically wrong. And it saps our capability for independent thought, because it orients us toward reacting to the powerful, rather than acting ourselves.

This is especially pernicious in a period when technology is opening up ways to build new and better institutions. While I understand the visceral appeal of dumping a bucket of pig excrement on Fred Hiatt, this takes time away from what will have a longer-term impact: nurturing our own fledgling efforts.

This is similar to what I wrote — in the context of a NY Times review of a book about “pseudoscience” — about skeptics being a dime a dozen and sophisticated appreciation being what’s really lacking.

A Different Sort of Scientific Progress: Toward Utility

From an excellent column by Tim Harford:

Esther Duflo, a French economics professor at MIT, wondered whether there was anything that could be done about absentee teachers in rural India, which is a large problem for remote schoolhouses with a single teacher. Duflo and her colleague Rema Hanna took a sample of 120 schools in Rajasthan, chose 60 at random, and sent cameras to teachers in the chosen schools. The cameras had tamper-proof date and time stamps, and the teachers were asked to get a pupil to photograph the teacher with the class at the beginning and the end of each school day.

It was a simple idea, and it worked. Teacher absenteeism plummeted, as measured by random audits, and the class test scores improved markedly.

Another young economist, Ben Olken of Harvard, used a similar randomisation technique to work out whether corruption in Indonesian road-building projects was best fought top-down, using audits, or bottom-up, soliciting comments from local villagers about whether money was being embezzled. One challenge was to work out how much embezzlement was taking place. Olken enlisted engineers to take samples of the road’s structure and to estimate how much it should have cost to build; he compared that estimate with how much spending was claimed in the project’s accounts. The missing funds were a rough guide to the amount embezzled.

In contrast to Duflo’s results, Olken found that the bottom-up monitoring was not effective – it shifted the embezzlement from something the villagers cared about (wages) to something they did not (building materials). The threat of a guaranteed audit – a threat that was later carried out – was much more effective, reducing the estimates of missing funds by a third.

A chapter in The Theory of the Leisure Class by Thorstein Veblen is about academia and the tendency toward uselessness, which Veblen explained as a way of signalling that one doesn’t need to work. As a general rule at research universities, useful = low status. A few years ago, I had lunch with an engineering professor. By far the most useful thing to come out of the UC Berkeley Electrical Engineering Department in the last 20 years, he told me, was a circuit analysis program (SPICE). Used everywhere. A big contribution to the field. Who did it? I asked. He didn’t know. That’s how low-status it was — no professor wanted to be closely associated with it.

The curious thing about the two examples that Harford describes is that they are happening at the same time. Is this a coincidence? Or is there an explanation?

Peter Pronovost’s research on ICU checklists is far more useful than one would expect from a medical school professor; likewise my self-experimentation about everyday problems (e.g., poor sleep) was far more useful than one would expect from a psychology professor. So perhaps there is some sort of larger discipline-spanning force at work.

Interview with Gary Taubes (part 2)

INTERVIEWER What do you think about prions?

TAUBES Here’s the problem with prions: the claim is that here’s a radical discovery — an infectious agent that doesn’t have nucleic acid — and it’s based fundamentally on a negative result, which is that when researchers have gone looking for the nucleic acids they failed to find them. Therefore, so the logic goes, they must not be there. The original claim, by Stan Prusiner, another Nobel Prize winner, was premature. He made some claims in his early papers that were definitively wrong. Yet everything he’s done since then supports his initial claim, which suggests he was either remarkably lucky to begin with, or he’s only capable of interpreting his results so that they agree with his preconceptions. One of the themes in all of my work is that if you go public on premature data, what happens is that the motivation to do really good science ceases. By “really good science” I mean that what you’re supposed to do, as brutally as you can, is try to come up with tests that would refute your own hypothesis. The idea is that if your hypothesis survives every rigorous test you can imagine, and all those that everyone else can imagine, then you can start believing it’s true. But once you’ve staked a claim based on premature data — once you’ve gone out on a limb without doing any of those rigorous tests — now your motivation becomes to prove that you were right, which you can never do in any case. But the point is that you stop trying to refute your hypothesis, and you start trying to accumulate evidence that supports it, and the latter isn’t science. It’s more like what happens in religions.

INTERVIEWER That’s what happened with Peter Duesberg. He was a good scientist until he started making claims about HIV.

TAUBES When I wrote this prion article in 1987, the science was so bad that it was a joke. Still, I never said that the prion concept wasn’t correct; I just said there was excruciatingly little evidence to support it, and there were plenty of reasons to believe it was wrong. How do you get strains of an infectious agent without nucleic acids (RNA or DNA) to encode the information in the strains? If you actually look today, even though Prusiner has won the Nobel Prize, if you go to the WHO website or the NIH website and you read up on prions, you’ll see that it’s still considered a hypothesis. There’s still no way to explain how you can get strains without a virus. Prusiner has these ideas, but they’re along the lines of “then a miracle happens.” It’s another long story, but one of the problems (and this is a theme in my book) is that when you let an untested hypothesis grow and infect the science to the point where people start to believe it’s true, even though it’s never been rigorously tested, the obstacles against ever overturning it get bigger and bigger. It’s like the dietary fat hypothesis: you let it sit around for 40 years, and it evolves to the point that people consider it dogma; it’s virtually impossible to overturn it. The situation with prions isn’t so bad because the public doesn’t care about prions the way that they care about diet, but once the Nobel Prize is awarded, even though it’s still considered a hypothesis, people tend to ignore the studies that suggest it’s wrong. There’s one researcher from Yale who is constantly publishing evidence in major journals that she’s found the nucleic acids, and people just ignore her. They believe the question has already been answered.

INTERVIEWER What’s her name?

TAUBES Laura Manuelidis.

Interview directory.

Can Anti-Depressants Cause Suicide?

Many parents have said yes. David Healy, a Scottish psychiatrist, prompted by those stories, did a small experiment in which undepressed persons took anti-depressants. About 10% of them started having suicidal thoughts. Drug companies and the University of Toronto (where Healy had been offered a job) reacted very badly to this information, as Healy describes in Let Them Eat Prozac. An article in the latest issue of the American Journal of Psychiatry by David Leon, a biostatistician on the FDA oversight panel, describes why he voted to extend a warning about this from children (< 18 years old) to young adults (18-24 years old). This was the main data:

[Figure: risk ratios by age]

What’s shown is the odds ratio for a report of suicidal ideation or behavior, comparing those who got anti-depressants with those who got placebos. An odds ratio of more than 1 means greater risk in those who got anti-depressants. The red bar is from a different study. When different ages are lumped together there is no increase in risk, but that hides opposing tendencies at high and low ages.
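For readers unfamiliar with the measure, here is how such an odds ratio is computed, using made-up counts (not the actual trial data):

```python
# Hypothetical counts (not the trial data): "events" are reports of
# suicidal ideation or behavior.
drug_events, drug_n = 12, 1000        # anti-depressant arm
placebo_events, placebo_n = 6, 1000   # placebo arm

odds_drug = drug_events / (drug_n - drug_events)
odds_placebo = placebo_events / (placebo_n - placebo_events)
odds_ratio = odds_drug / odds_placebo

print(f"odds ratio = {odds_ratio:.2f}")  # about 2: risk roughly doubled
# An odds ratio above 1 means more events among those on the drug.
```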

The article contains this curious sentence: “The results did not provide definitive evidence of risk, yet they failed to demonstrate an absolute absence of risk.” No possible results could “demonstrate an absolute absence of risk” so it is unclear what Dr. Leon meant. Later he writes: “My vote to extend the black box warning to young adults was based on concern that risk of suicidality could not be ruled out and, given the widespread antidepressant use, even a small risk must not be ignored.” Yes, he has it backwards: The data do not “fail to rule out” suicide risk (no possible data could “rule out” such risk, i.e., show the risk is zero); they manage to overcome a barrier to show it’s there. And yes, he’s congratulating himself (“even a small risk must not be ignored”) for doing his job.

Uh-oh. That someone — a biostatistician, no less — in such a powerful regulatory position fails to understand basic concepts is bad enough; to make things worse, Dr. Leon has received money from three of the companies (Eli Lilly, Organon, and Pfizer) he oversees.

Related post by Andrew Gelman.

The Power of Placebos Over Health Journalists

In the New York Times, Abigail Zuger, an M.D., recently reviewed a book called Snake Oil Science: The Truth About Complementary and Alternative Medicine by R. Barker Bausell — the “truth” being, if I read Zuger correctly, that it’s all baloney. Zuger calls the book “immensely educational”. Not educational enough:

Dr. Bausell starts out with the story of his late mother-in-law, Sarah, a concert pianist who developed painful arthritis in her old age and found her doctors to be generally useless when it came to satisfactory pain control. “So, being an independent, take-charge sort of individual, she subscribed to Prevention magazine, in order to learn more about the multiple remedies suggested in each month’s issue” for symptoms like hers.

What ensued, according to Dr. Bausell, was a predictable pattern. Every couple of months Sarah would make a triumphant phone call and announce “with great enthusiasm and conviction” that a new food or supplement or capsule had practically cured her arthritis. Unfortunately, each miracle cure was regularly replaced by a different one, in a cycle her son-in-law ruefully breaks down for detailed analysis.

Neither Bausell nor Zuger notices two problems here: (a) The alternative treatments worked better than the conventional ones. They didn’t provide permanent relief, true, but apparently conventional medicine (“useless”) didn’t provide any relief. Something is better than nothing — and something is wrong with Bausell’s interpretation of this story. (b) Why didn’t the conventional treatments benefit from the placebo effect?

That Zuger thinks this story supports her claim that the book is good suggests the power of pre-conceived notions, not the power of placebos.

The book is published by Oxford University Press. Bausell has a Ph.D. in Educational Research and works as a methodologist.

Not only do Bausell and Zuger fail to see what the mother-in-law story means, they fail to grasp a larger point: Skeptics are a dime a dozen. The attitude in short supply is sophisticated appreciation.

Thanks to Dave Lull and Hal Pashler.

Science in Action: Procrastination (evidence or anti-evidence?)

Evidence is the raw fuel of science: we collect data, and it pushes forward our understanding. But there is also anti-evidence: observations that have the effect of holding back our understanding. The clearest example I know comes from experiments that supposedly “tested” mathematical learning theories in the 1950s and later. The observation was that the theory could fit the data. Theorists wrote papers to report this observation. In fact, the theory was so flexible it could fit any plausible results. The papers, which were taken seriously, retarded the study of learning because they wasted everyone’s time. They gave the illusion of progress. Hal Pashler and I wrote about this.
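A toy illustration of why such a fit is empty (the polynomial below is just a stand-in for any model with as many free parameters as data points, not the actual learning-theory models):

```python
# A model with as many free parameters as data points fits anything,
# so a good fit rules nothing out and is no evidence for the model.
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(6.0)
y = rng.uniform(0, 1, size=6)       # arbitrary made-up "results"

coeffs = np.polyfit(x, y, deg=5)    # 6 parameters for 6 data points
fitted = np.polyval(coeffs, x)

print("max fit error:", np.max(np.abs(fitted - y)))  # essentially zero
# The "theory" fits perfectly no matter what y is; the perfect fit
# looks like progress but tells us nothing about learning.
```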

Another example of anti-evidence, I think, is the sort of data that linguistic theorists have been fond of: Observations that this or that sentence or sentence fragment strikes the theorist as grammatical, i.e., possible. Not studies of how people actually talk; the observation that a speaker of English or whatever could say this or that. The theorist’s judgment based on introspection. I’m not saying that this isn’t actual data of some sort; I just suspect that the value of these sorts of observations has been overrated and the net effect has been to keep linguists from collecting data that would push theorizing forward.

Months ago I blogged about how I found that when I made playing a game contingent upon clearing off my kitchen table, I was able to clear off the table. Which had been messy for quite a while. My question: is this evidence or anti-evidence? If I think about this, and try to understand it, will I be deluding myself, as the mathematical learning theorists and the linguistic theorists deluded themselves? On its face, it seems like a very ordinary, very narrow observation, much like the observation that “George played with the game Dave brought over” is a possible English sentence. On the other hand, it is something unusual and helpful that actually happened, unlike an observation that this or that is a possible English sentence.

When someone says “the plural of anecdote is not data,” you can be sure their grasp of scientific method is weak; lots of important discoveries have begun with accidental single observations. But those productive single observations are always surprising. My table-clearing observation was slightly surprising…