Brian Nosek’s Reproducibility Project touched on a major problem in research: the publish-or-perish dilemma. To get hired by a university, to get tenure, and to get research grants—all the gateways to a successful university career—you have to publish in good journals and publish a lot. “Publish or perish,” they say, and that incentivizes researchers to get published no matter what.
So, what does this have to do with the Reproducibility Project, probably the most famous study in social sciences in the last decade? You can hear Nosek talk about it in this great podcast, Planet Money’s The Experiment Experiment, but I’ll give you the lowdown here as well. Basically, he identified a replication crisis.
In 2011, Nosek was disturbed by a paper in a top journal that showed that people could predict the future, as in predicting a significant number of coin tosses correctly. The experiments reported in the paper were done by a highly regarded researcher, conducted rigorously, and fully peer reviewed. Nosek couldn’t find anything wrong, which made him wonder: does ESP really exist, or is there something wrong with the way we do science?
To find out, he and a couple hundred other researchers tried to replicate 100 studies that had been published in the top three journals of psychology. They took great pains to recreate the same rigorous conditions as the published studies, often with the original authors’ help. They expected a few of the new attempts might not get the same results but, to their outright shock, it was far more than a few. Over 60% did not replicate. Almost two-thirds.
So why not? The first reason that comes to mind is fraud. This brings us back to the publish-or-perish problem. People cheat, and “scientific misconduct has existed since the beginning of science” (Hughes, 2013, para 5). Even data used by Gregor Mendel, the Father of Genetics, looks fudged. Did the thousands of peas he counted really fall into an exact 3-to-1 ratio? Of course not. He obviously tossed a few. A priest, even! And now too, we often see cases of fraud in the headlines, as with Wakefield, Kaidi, and Obokata, but that is just the tip of the iceberg. Look at this long list of misconduct incidents.
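Why is an “exact” 3-to-1 ratio suspicious? Honest counting wobbles. Here is a toy simulation (my own illustration, with a made-up round number of peas, not Mendel’s actual data) showing how far a truthful count of a true 3:1 trait should typically stray from a perfect split:

```python
import random
import statistics

random.seed(7)

# Mendel's theory predicts a 3:1 ratio of dominant to recessive traits.
# Simulate honestly counting 8,000 peas (a hypothetical round number),
# 200 times over, and see how far the counts naturally drift from 3:1.
peas = 8000
expected_dominant = peas * 3 / 4  # a perfect 3:1 split = 6,000 dominant

deviations = []
for _ in range(200):
    dominant = sum(random.random() < 0.75 for _ in range(peas))
    deviations.append(abs(dominant - expected_dominant))

print(f"Typical drift from a perfect 3:1 split: "
      f"about {statistics.mean(deviations):.0f} peas either way")
```

Counts that land closer to the ideal ratio than chance allows, again and again, are exactly what made later statisticians squint at Mendel’s notebooks.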
And so, Nosek’s replication crisis shows us that research is fraught with fraud.
Except it doesn’t.
While there might have been some misconduct in the 62 studies that did not replicate, that is not what Nosek’s group found. Nor did they find poor procedures or poor analysis. After all, these papers were published in major journals. That means the studies were done by highly reputed researchers, experts who would not risk lying, and then peer-reviewed by even more experts. Fraud 60% of the time? Highly unlikely.
But what Nosek did find…is even worse.
The system is biased in favor of statistical flukes. If ten people do the exact same experiment with the exact same degree of rigor, the odds are that one of them will get exceptional results, a strong correlation, even if the other nine show none. Get it? It’s odds. And guess which study gets published? This is known as the “File Drawer Effect.” As Nosek tells us, 97% of the psych papers published in major journals have positive results. Those that did not were filed away.
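That “odds” argument can be made concrete with a short simulation (my own sketch, not Nosek’s data): send a thousand imaginary labs after an effect that does not exist, and watch how many still “find” it at the conventional p < 0.05 bar.

```python
import random
import statistics

random.seed(42)

def fake_experiment(n=30):
    """One lab's study of a nonexistent effect: two groups drawn from
    the SAME distribution, compared with a two-sample t statistic."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    t = (statistics.mean(a) - statistics.mean(b)) / (
        (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5)
    return abs(t) > 2.0  # roughly the p < 0.05 cutoff for n = 30

labs = 1000
false_positives = sum(fake_experiment() for _ in range(labs))
print(f"{false_positives} of {labs} labs got a 'significant' result "
      f"({false_positives / labs:.0%}) with no real effect at all")
```

Roughly one lab in twenty gets a publishable fluke. If journals print the flukes and the other nineteen results go in the file drawer, the literature fills up with effects that were never there.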
And, though no fraud is involved, the odds of getting a positive result are often juiced by the researchers themselves. “The null hypothesis was not rejected? No positive results? Hmm. I wonder why not? Maybe we didn’t have enough subjects. Let’s keep going.” And they keep going, innocently altering the original plan, trying different tools of analysis, or just doing more of the same thing, until they do see positive results. Then they stop. After all, who would keep running an experiment just to disprove themselves? It is not hard for that underlying desire to publish to lean over and tell the desire for truth to stay seated.
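How much does that innocent “let’s keep going” juice the odds? Another toy simulation (again my own illustration, not from the article): each lab peeks at its data after every batch of subjects and stops the moment the result looks significant, still on an effect that does not exist.

```python
import random
import statistics

random.seed(1)

def looks_significant(a, b):
    """Crude two-sample t test at roughly the p < 0.05 level."""
    n1, n2 = len(a), len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / (
        (statistics.variance(a) / n1 + statistics.variance(b) / n2) ** 0.5)
    return abs(t) > 2.0

def keep_going_until_significant(max_n=100, batch=10):
    """Study a nonexistent effect, but 'peek' after every batch of
    subjects and stop as soon as the result crosses the line."""
    a, b = [], []
    while len(a) < max_n:
        a += [random.gauss(0, 1) for _ in range(batch)]
        b += [random.gauss(0, 1) for _ in range(batch)]
        if looks_significant(a, b):
            return True   # stop and write it up
    return False          # into the file drawer

runs = 1000
hits = sum(keep_going_until_significant() for _ in range(runs))
print(f"Peeking after every batch: {hits / runs:.0%} 'positive' results "
      f"on a nonexistent effect, versus about 5% for a fixed-size study")
```

Simply deciding when to stop based on the results, with no dishonesty anywhere, roughly triples the rate of false positives. This is exactly the kind of quiet flexibility that preregistration is designed to expose.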
So, there is bias in the system, but there are also ways to counter it. The most powerful is the use of online research registries like this one. If the research plan is entered into a registry before the experiment starts, then any alterations to it later are visible to reviewers. And if the studies that fail to get positive results, the ones that normally go in the file drawer, are also put in the registry, then we get a better view of what science has actually found about a particular topic. Use of registries is a growing trend now, and some major publications accept only papers with registered studies. As we can see here, registration is working.
Of course, it is possible that the results that Nosek got were also a statistical fluke. But that does not matter. He has changed the way we see science, as profoundly as Thomas Kuhn and Gregor Mendel did. Peas be with you.
Curtis Kelly (EdD) is a professor at Kansai University, a founder of the JALT Mind, Brain, and Education SIG, and producer of the MindBrainEd Think Tanks. He has written over 30 books and given over 500 presentations. His life mission is “to relieve the suffering of the classroom.”