"Long before it's in the papers"
August 27, 2015


Fewer than half of psychology papers met replication test, study finds

Aug. 27, 2015
Special to World Science

Only 39 out of a sample of 100 recently published scientific reports in psychology stood up to an attempt to replicate the findings, according to a new analysis.

That doesn’t prove all the original findings were wrong, but it does raise concerns, according to the authors of the study and others.

It “suggests that there is still more work to do to verify whether we know what we think we know,” the authors wrote, publishing their findings in the Aug. 28 issue of the journal Science.

That original experiments can be reproduced with similar results “is a core principle of scientific progress,” they wrote. The idea is that “scientific claims should not gain credence because of the status or authority of their originator but by the replicability of their supporting evidence.”

But there is little data showing how many past studies hold up to this scrutiny, they added. The concern is that the search for truth may be compromised by a widespread bias that puts too much of a premium on results that are simply interesting. Such a bias can affect both the scientists themselves and the editors who select their work for the academic journals that disseminate it.

“This very well done study shows that psychology has nothing to be proud of when it comes to replication,” Charles Gallistel, president of the Association for Psychological Science, told Science in a news article accompanying the findings.

But the authors and others stressed that the problem isn’t unique to psychology.

“We investigated the reproducibility rate of psychology not because there is something special about psychology, but because it is our discipline,” wrote the researchers, a collaboration led by Brian Nosek of the University of Virginia.

“Concerns about reproducibility are widespread across disciplines,” they added. “Reproducibility is not well understood because the incentives for individual scientists prioritize novelty over replication. If nothing else, this project demonstrates that it is possible to conduct a large-scale examination of reproducibility despite the incentive barriers.”

The paper’s 270 contributing authors, part of an effort known as the Reproducibility Project, collaborated with the authors of the original findings in carrying out replication experiments.

“A direct replication may not obtain the original result for a variety of reasons,” the report cautioned. “Known or unknown differences between the replication and original study may moderate the size of an observed effect, the original result could have been a false positive, or the replication could produce a false negative.”

The authors defined successful replication as depending on several criteria. One was whether the re-run of an experiment matched the original in yielding what are considered statistically “significant” (or statistically insignificant) results.

Looking at this measure, they noted, 97 of the original studies, but only 36 of the replications, had statistically significant results. Overall, “replication effects were half the magnitude of original effects, representing a substantial decline,” they wrote.
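The link between those two numbers can be made concrete with a small simulation. It is not taken from the paper: the effect size (0.5 of a standard deviation), sample size (30), and the simple z-test are all hypothetical choices, used only to illustrate why studies of the same design become far less likely to reach statistical significance when the true effect is half as large.

```python
# Illustrative only: how halving an effect size cuts the share of
# "statistically significant" studies, at a fixed sample size.
import math
import random
import statistics

random.seed(0)

def significant_fraction(effect, n=30, trials=2000):
    """Fraction of simulated studies (normal data, known SD of 1)
    whose two-sided z-test reaches p < 0.05."""
    z_crit = 1.96  # two-sided 5% critical value
    hits = 0
    for _ in range(trials):
        sample = [random.gauss(effect, 1.0) for _ in range(n)]
        z = statistics.mean(sample) * math.sqrt(n)  # z = mean / (SD/sqrt(n))
        if abs(z) > z_crit:
            hits += 1
    return hits / trials

orig_power = significant_fraction(0.5)    # hypothetical original effect
repl_power = significant_fraction(0.25)   # same design, effect half as large
```

With these (made-up) numbers, roughly three-quarters of the simulated “original” studies come out significant, but well under half of the “replications” do, even though a real effect is present in both. A drop of that kind can occur without any study being wrong.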

But despite the importance of reproducibility, the authors stressed, it shouldn’t always be expected “from the onset of a line of inquiry through its maturation.”

“This is a mistake. If initial ideas were always correct, then there would hardly be a reason to conduct research in the first place. A healthy discipline will have many false starts as it confronts the limits of present understanding.”

* * *

