Lack of Repeatability of Cancer Research: The Mystery

In a recent editorial in Nature (gated), the head of cancer research at a drug company complained that scientists working for him had been unable to repeat almost all of the “landmark” findings in cancer research they tried to reproduce. They had wanted to use these findings as a basis for new drugs. An article in Reuters summarized it like this:

During a decade as head of global cancer research at Amgen, C. Glenn Begley identified 53 “landmark” publications — papers in top journals, from reputable labs — for his team to reproduce. Begley sought to double-check the findings before trying to build on them for drug development. Result: 47 of the 53 could not be replicated.

Yet these findings were cited, on average, about 200 times. The editorial goes on to make reasonable suggestions for improvement, based on differences between the findings that could be repeated and those that could not. The Reuters article describes other examples of irreproducibility and includes a story about why this is happening:

Part way through his project to reproduce promising studies, Begley met for breakfast at a cancer conference with the lead scientist of one of the problematic studies. “We went through the paper line by line, figure by figure,” said Begley. “I explained that we re-did their experiment 50 times and never got their result. He said they’d done it six times and got this result once, but put it in the paper because it made the best story.”
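
A quick aside on the arithmetic of “six times, one hit”: if each run of an experiment with no real effect has a 5% chance of a false positive (the conventional p < 0.05 threshold; the anecdote doesn’t say what statistics the lab used, so this is an assumption of mine), noise alone produces at least one “publishable” result in six runs about a quarter of the time. A sketch in Python:

    # Chance that pure noise yields at least one "hit" in six tries,
    # assuming a 5% false-positive rate per run (my assumption; the
    # anecdote gives no statistical details).
    p_false = 0.05
    runs = 6
    p_any_hit = 1 - (1 - p_false) ** runs
    print(f"P(at least one spurious hit in {runs} runs) = {p_any_hit:.1%}")
    # prints: P(at least one spurious hit in 6 runs) = 26.5%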

Okay, cancer research is less trustworthy than someone just barely outside it (Begley) ever guessed. Apparently careerism is one reason why. What neither the Nature editorial nor the Reuters summary explains is how research can ever succeed if results aren’t reproducible. Science has been compared to a game of Twenty Questions. Suppose you play Twenty Questions and 25% of the answers are wrong: to win, every answer must be right, and the chance of that is 0.75^20, about 0.3%. It’s hopeless (the simulation below makes this concrete). In experimental research, you generally build on previous experimental results. The editorial points out that the non-reproducible results had been cited about 200 times, but how often had they been reproduced in other labs? The editorial says nothing about this.
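
Here is a minimal simulation of that claim, in Python. The setup (guessing one number out of a million, which 20 error-free questions can just barely pin down, since 2^20 > 10^6) and every name in it are mine, purely for illustration:

    import random

    def noisy_twenty_questions(secret, n_max=1_000_000, error_rate=0.25):
        """Binary search for secret in 1..n_max with 20 yes/no
        questions, each answered wrongly with probability error_rate."""
        lo, hi = 1, n_max
        for _ in range(20):
            mid = (lo + hi) // 2
            honest = secret <= mid  # true answer to "is it <= mid?"
            answer = honest if random.random() >= error_rate else not honest
            if answer:
                hi = mid
            else:
                lo = mid + 1
        return lo == hi == secret

    trials = 100_000
    wins = sum(noisy_twenty_questions(random.randint(1, 1_000_000))
               for _ in range(trials))
    print(f"win rate with 25% wrong answers: {wins / trials:.2%}")
    # A single wrong answer excludes the secret for good, so the win
    # rate is 0.75**20, roughly 0.3%.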

I can think of several possibilities: (a) Current lab research is based on experimental findings from thirty years ago, when (for unknown reasons) careerism was less of a problem. Standards were higher, there was less pressure to publish, whatever. (b) There is a silent, invisible “survival of the reproducible”: findings that can be reproduced live on because people do lab work based on them; the other findings are cited but are not the basis of new work. (c) There is lots of redundancy — different people approach the same question in different ways. Although each individual answer is not very trustworthy, their average is considerably more trustworthy (the sketch after this paragraph illustrates how).
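
Possibility (c) is essentially majority voting over noisy, independent verdicts. A minimal sketch in Python; the 60% per-study reliability and all names are my assumptions, chosen only to illustrate the effect:

    import random

    def majority_right(n_studies, p_correct=0.6, trials=100_000):
        """How often the majority verdict of n_studies independent
        studies, each right with probability p_correct (an assumed,
        illustrative reliability), answers a yes/no question
        correctly."""
        wins = 0
        for _ in range(trials):
            right = sum(random.random() < p_correct for _ in range(n_studies))
            if right > n_studies / 2:
                wins += 1
        return wins / trials

    for k in (1, 5, 15, 45):
        print(f"{k:2d} studies: majority right {majority_right(k):.0%} of the time")
    # Each study alone is barely better than a coin flip, yet the
    # majority of 45 such studies is right about 90% of the time.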

Leaving aside the mystery (how can science make any progress if so many results are not reproducible?), the lack of reproducibility interests me because it suggests that the pressure to publish faced by professional scientists has serious (bad) consequences. In contrast, personal scientists are under zero pressure to publish.

Thanks to Bryan Castañeda.

14 thoughts on “Lack of Repeatability of Cancer Research: The Mystery”

  1. Wow. They did it six times and got the result they published once!? I’m generally a skeptic of medical research, but I still find this amazing.
    Seth: I agree. That this sort of thing is perfectly legal — goes undetected — is one illustration of the fact that enormously prestigious journals, such as Science and Nature, employ peer reviewers with major gaps in their understanding. It’s like an accounting system with a big hole that allows fraud.

  2. This post reminds me of this excellent article about the shenanigans used by Bristol-Myers Squibb (pharmaceutical company) to gain approval for their antidepressant drug, Serzone. “In testing results submitted to the FDA, Serzone failed to show a clear benefit in six of the eight clinical trials.” And the two positive trials were suspect.

  3. You don’t seem to consider the obvious possibility: That science, in most fields, is simply not making progress, that such “progress” as is reported is accomplished by rewriting the past and imposing official consensus.
    Seth: If “science, in most fields” is no longer making progress, what changed?

  4. As to what caused the problem: Peer review caused the problem. It is career suicide to discover that someone else’s research is irreproducible.
    Seth: Your second sentence (“It is career suicide…”) is an excellent point. I don’t see the connection with your first sentence (“As to what…”).

  5. Is it really career suicide? It depends on the case. Everybody jumped on “cold fusion” immediately, and multiple labs rushed to replicate it but were unable to. None of these labs’ reputations were harmed by the failure to replicate.
    Of course this was a celebrity case, and a media event.

  6. Most drug and disease studies are done only to support or discredit particular products. Professional science today is an advertising war — and has nothing to do with reality. Even government-funded studies are likely corrupted by lobbying efforts.
    Labs are funded with an expected outcome, and studies are repeated, apparently, until that outcome is achieved, with the failures discarded anonymously.

  7. Remember how we used to think that people who believed that the country was a giant invisible criminal conspiracy were cranks?

  8. It seems to me that some of the best research is being done by self-taught bloggers like Denise Minger and Paul Jaminet, neither of whom has credentials in nutrition. Almost every professional field has now become a guild that exists primarily to perpetuate itself and not to seek the truth. Education, psychotherapy, literary studies, the arts, and even many of the sciences have become disconnected from the joy of learning.

  9. It’s not completely correct to believe that experiments are done just once and then blindly believed to be true if they appear in a respectable paper. A proper explanation of the research methods and equipment used is a standard part of scientific articles, so it is easy to take those methods and build on them. Hence, results are often reproduced in later studies by various researchers, when new experiments use previous research methods as a starting point.

  10. garymar: The difference is that cold fusion wasn’t *supposed* to exist. New successful drugs are supposed to exist; if they didn’t, how do these fields & the scientists they employ justify themselves?

  11. I see the situation only slightly differently than James. I see peer review as reproducing the pre-existing status hierarchy.
    When Richard Feynman submits a paper, it is career suicide to reject it. Reviewers are supposed to be anonymous, but in practice they cannot count on that anonymity holding. If they repeatedly reject papers on scientific rather than sociological grounds, they run that risk repeatedly. For example, I keep hearing about reviewer shenanigans, such as conflicts of interest, which should be impossible to discover if anonymity were real.
    When J random submits a paper, it is career suicide to accept it.
    When J random, Ph.D., submits a paper, the decision is based on whether it makes Feynman look bad. As per the link, if it makes Feynman look bad, the journals will reject it, and get legalistic if J random insists, strenuously pursuing plausible deniability. Feynman will be able to publish multiple replies without difficulty.
    Under certain conditions, s/Richard Feynman/Michael Mann/g

  12. Great post. This is a very real concern in biomedical research. One valid explanation for why some results don’t replicate is complexity. Under a model of complexity and nonlinearity, you don’t necessarily expect some things to replicate. I wrote a paper several years ago showing how and why real genetic association results might not replicate as allele frequencies change slightly from study to study; it was published in PLoS One (open access).

  13. Hi Seth, great post!
    I think this is a major issue. To address it will require changes in the incentive system that promote collaboration and reproducibility of preclinical academic research. Any new set of incentives should focus, in part, on the funding agencies and journal publications that support academic research. For instance:
    – Funding agencies should allocate some proportion of each grant to allow research groups to include independent replication in their project workflow.
    – Journal editors should encourage and acknowledge independent replication of key results (either pre- or post-publication).

    Seth: Thanks. I agree with your broad point and I like your specific suggestions.
