While introducing Justin Wolfers as guest blogger at Marginal Revolution (which I am greatly looking forward to, since Wolfers is an excellent data analyst), Alex Tabarrok wrote:
An open secret and an open sin in economics is that many empirical studies are difficult to replicate, even when journals supposedly require authors to make their data publicly available.
Which reminds me. Several months ago, I read an article in a psychology journal about a topic I care a lot about. The conclusions of the article were the opposite of what I think is the case. Was I wrong? Possibly — but the data analysis done in the article was unquestionably “wrong” in the sense that (a) it assumed something that was unlikely to be true and (b) it was possible to do a data analysis that didn’t make that unlikely assumption. I don’t think my opinion here is controversial; I think a blunt but fair summing up of the situation is that the authors made a big mistake.
I was in New Orleans a few weeks after the article appeared. Someone in an art gallery told me the conclusion of the paper! Which is only to say it is a really interesting conclusion. Anyway, I wrote to the first author of the paper (a graduate student) to explain my concern about their conclusions and to ask for the data, so I could do a better analysis. Two weeks went by, no answer. I sent a reminder email, and got this answer:
We typically do not give out our original data, but when I get a chance, I will run the analyses in HLM and get the results back to you. Thanks for your interest in the study,
Wow! It is the policy of the journal in which the paper was published that the data be made available. A month passed. “When do you expect to run these analyses?” I wrote. Another month passed with no answer. I wrote to the faculty member who was a co-author on the paper. Finally I got an answer from the student:
I have been meaning to respond to your email & I apologize for not getting back to you sooner. I am a graduate student and am traveling for the summer. I understand the difficulty with the [blank] situation and am assuming that HLM would be a good way to work through that. However, I am not familiar with the procedure, so it will not be until late August/ early September when I can get a statistician here at [blank] to teach me the procedure. If you have specific suggestions about the analyses, please let me know and I will keep that in mind when I get a chance to work with it. We should have some follow-up data coming in as well so it will be good to learn the procedures for future research. Thanks for your interest in the study.
The story so far is uncomfortably close to what happened when Saul Sternberg and I questioned Ranjit Chandra’s data. Similarity 1: He never provided the data. Similarity 2: It took a remarkably long time and several emails to get any response. Similarity 3: The response, when it finally came, was only vaguely reassuring. However, in this case, I predict the better analysis will actually be done. Which is good: I would rather someone else do it.