In a recent editorial in Nature (gated), the research head of a drug company complained that scientists working for him could not reproduce almost all of the “landmark” findings in cancer research that they tried to repeat. They had wanted to use these findings as a basis for new drugs. An article in Reuters summarized it like this:
During a decade as head of global cancer research at Amgen, C. Glenn Begley identified 53 “landmark” publications — papers in top journals, from reputable labs — for his team to reproduce. Begley sought to double-check the findings before trying to build on them for drug development. Result: 47 of the 53 could not be replicated.
Yet these findings were cited, on average, about 200 times each. The editorial goes on to make reasonable suggestions for improvement based on differences between the findings that could be repeated and those that could not. The Reuters article describes other examples of irreproducibility and includes a story about why this is happening:
Part way through his project to reproduce promising studies, Begley met for breakfast at a cancer conference with the lead scientist of one of the problematic studies. “We went through the paper line by line, figure by figure,” said Begley. “I explained that we re-did their experiment 50 times and never got their result. He said they’d done it six times and got this result once, but put it in the paper because it made the best story.”
Okay, cancer research is less trustworthy than someone just barely outside it (Begley) ever guessed. Apparently careerism is one reason why. What is unexplained in both the Nature editorial and the Reuters summary is how research can ever succeed if things aren’t reproducible. Science has been compared to a game of Twenty Questions. Suppose you play Twenty Questions and 25% of the answers are wrong. It’s hopeless. In experimental research, you generally build on previous experimental results. The editorial points out that the non-reproducible results had been cited about 200 times, but says nothing about how often they had been reproduced in other labs.
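To put a rough number on “hopeless”: if each answer is independently wrong 25% of the time, the chance that all twenty answers in a game come out right is 0.75^20, about 0.3%. Here is a toy simulation of that (my sketch, with my simplification that a game counts as won only if every single answer is right; the 25% figure is the hypothetical from above, not anything measured):

```python
import random

def all_answers_correct(n_questions=20, p_wrong=0.25):
    """One game of Twenty Questions with unreliable answers.
    Returns True only if every answer happened to be correct,
    which is what you need to actually narrow down the target."""
    return all(random.random() > p_wrong for _ in range(n_questions))

games = 100_000
wins = sum(all_answers_correct() for _ in range(games))
print(f"games where every answer was right: {wins / games:.2%}")  # ~0.32%
```

With a quarter of the answers wrong you essentially never get through a clean game, because a single wrong answer sends the rest of the search down a blind alley. That is what building on non-reproducible results looks like.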
I can think of several possibilities: (a) Current lab research is based on experimental findings of thirty years ago, when (for unknown reasons) careerism was less of a problem. Standards were higher, there was less pressure to publish, whatever. (b) There is a silent, invisible “survival of the reproducible”: findings that can be reproduced live on because people do lab work based on them. The other findings are cited but are not the basis of new work. (c) There is lots of redundancy: different people approach the same question in different ways. Although each individual answer is not very trustworthy, their average is considerably more trustworthy.
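Possibility (c) is just the arithmetic of redundancy. A minimal sketch (the 60% per-attempt accuracy is a number I made up for illustration): if each independent attempt at a question is right only 60% of the time, a majority vote across five attempts is right about 68% of the time, and across twenty-five attempts about 85%.

```python
from math import comb

def majority_correct(n_labs, p_right):
    """Probability that a simple majority of n_labs independent
    attempts lands on the right answer, when each attempt is
    individually right with probability p_right (binomial tail).
    p_right = 0.60 below is an illustrative guess, not data."""
    k_needed = n_labs // 2 + 1
    return sum(comb(n_labs, k) * p_right**k * (1 - p_right)**(n_labs - k)
               for k in range(k_needed, n_labs + 1))

for n in (1, 5, 25):
    print(f"{n:2d} attempts: {majority_correct(n, 0.60):.2f}")
# 1 attempts: 0.60
# 5 attempts: 0.68
# 25 attempts: 0.85
```

The catch is independence: if everyone copies the same flawed protocol, or only the “best story” runs get published, the attempts stop being independent and the averaging buys much less.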
Leaving aside the mystery (how can science make any progress if so many results are not reproducible?), the lack of reproducibility interests me because it suggests that the pressure to publish faced by professional scientists has serious (bad) consequences. In contrast, personal scientists are under zero pressure to publish.
Thanks to Bryan Castañeda.
A reader comments:
– Funding agencies should allocate some proportion of a grant to allow research groups to include independent replication in their project workflow.
– Journal editors should encourage and acknowledge independent replication of key results (either pre- or post-publication).
Seth: Thanks. I agree with your broad point and I like your specific suggestions.