Reproducibility and Replicability Reading List

This reading list was compiled from the hard labor of many other people, especially those who posted their syllabi on the OSF Open and Reproducible Methods site (https://osf.io/vkhbt/). The selection and organization of these readings is intended as a resource for our current 2018 graduate course on Reproducibility and Replicability, which means we focused primarily on readings that 1) identify the reasons for our current crisis, and 2) propose ways to fix our problems.

Anyone can comment on the document, so please recommend readings that we may have missed, complete citations that are missing information, or provide links to readings. Feel free to suggest new topics, flag readings that seem miscategorized, or propose different ways of organizing the existing topics.

Best regards,

Brent Roberts

Dan Simons

Definitions of reproducibility & replicability

Cacioppo, J. T., Kaplan, R. M., Krosnick, J. A., Olds, J. L., & Dean, H. (2015). Social, behavioral, and economic sciences perspectives on robust and reliable science. Report of the Subcommittee on Replicability in Science, Advisory Committee to the National Science Foundation Directorate for Social, Behavioral, and Economic Sciences.

Goodman, S. N., Fanelli, D., & Ioannidis, J. P. (2016). What does research reproducibility mean? Science Translational Medicine, 8(341), 341ps12.

LeBel, E. P., McCarthy, R., Earp, B., Elson, M., & Vanpaemel, W. (in press). A unified framework to quantify the credibility of scientific findings. Advances in Methods and Practices in Psychological Science. Retrieved from https://osf.io/preprints/psyarxiv/uwmr8

Zwaan, R. A., Etz, A., Lucas, R. E., & Donnellan, M. B. (in press). Making replication mainstream. Behavioral and Brain Sciences.

Martel García, F. (2016). Replication and the manufacture of scientific inferences: A formal approach. International Studies Perspectives, 17(4), 408-425. Retrieved from https://academic.oup.com/isp/article-abstract/17/4/408/2528279?redirectedFrom=fulltext

Houston, we have a problem (and by “we” we mean everyone, not just psychologists or social psychologists)

Alogna, V. K., Attaya, M. K., Aucoin, P., Bahník, Š., Birch, S., Birt, A. R., ... & Buswell, K. (2014). Registered Replication Report: Schooler and Engstler-Schooler (1990). Perspectives on Psychological Science, 9(5), 556-578.


Baker, M. (2016). Is there a reproducibility crisis? Nature, 533(7604), 452–454. http://doi.org/10.1038/533452a


Begley, C. G., & Ellis, L. M. (2012). Drug development: Raise standards for preclinical cancer research. Nature, 483(7391), 531-533.


Bergh, D. D., Sharp, B. M., Aguinis, H., & Li, M. (2017). Is there a credibility crisis in strategic management research? Evidence on the reproducibility of study findings. Strategic Organization, 1476127017701076.

Bouwmeester, S., Verkoeijen, P. P., Aczel, B., Barbosa, F., Bègue, L., Brañas-Garza, P., ... & Evans, A. M. (2017). Registered Replication Report: Rand, Greene, and Nowak (2012). Perspectives on Psychological Science, 12(3), 527-542.


Button, K. S., Ioannidis, J. P., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S., & Munafò, M. R. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365-376.

Cheung, I., Campbell, L., LeBel, E. P., Ackerman, R. A., Aykutoğlu, B., Bahník, Š., ... & Carcedo, R. J. (2016). Registered Replication Report: Study 1 From Finkel, Rusbult, Kumashiro, & Hannon (2002). Perspectives on Psychological Science, 11(5), 750-764. http://journals.sagepub.com/doi/pdf/10.1177/1745691616664694

Cialdini, R. B. (2009). We have to break up. Perspectives on psychological science, 4(1), 5-6.


Ebersole, C. R., Atherton, O. E., Belanger, A. L., Skulborstad, H. M., Allen, J. M., Banks, J. B., ... & Brown, E. R. (2016). Many Labs 3: Evaluating participant pool quality across the academic semester via replication. Journal of Experimental Social Psychology, 67, 68-82.

Eerland, A., Sherrill, A. M., Magliano, J. P., Zwaan, R. A., Arnal, J. D., Aucoin, P., ... & Crocker, C. (2016). Registered replication report: Hart & Albarracín (2011). Perspectives on Psychological Science, 11(1), 158-171.

Goldfarb, B., & King, A. A. (2016). Scientific apophenia in strategic management research: Significance tests & mistaken inference. Strategic Management Journal, 37(1), 167-176.

Grabitz, C. R., Button, K. S., Munafò, M. R., Newbury, D. F., Pernet, C. R., Thompson, P. A., & Bishop, D. V. (2018). Logical and methodological issues affecting genetic studies of humans reported in top neuroscience journals. Journal of Cognitive Neuroscience, 30(1), 25-41.


Ioannidis, J. P. (2005). Why most published research findings are false. PLoS Med, 2, e124.


Ioannidis, J. P. A. (2012). Why science is not necessarily self-correcting. Perspectives on Psychological Science, 7(6), 645-654. http://pps.sagepub.com/content/7/6/645.full.pdf


Hagger, M. S., Chatzisarantis, N. L., Alberts, H., Anggono, C. O., Batailler, C., Birt, A. R., ... & Calvillo, D. P. (2016). A multilab preregistered replication of the ego-depletion effect. Perspectives on Psychological Science, 11(4), 546-573.

Jennions, M. D., & Møller, A. P. (2003). A survey of the statistical power of research in behavioral ecology and animal behavior. Behavioral Ecology, 14(3), 438-445.

Makel, M. C., & Plucker, J. A. (2014). Facts are more important than novelty: Replication in the education sciences. Educational Researcher, 43(6), 304-316.

Meyer, M.N., & Chabris, C.F. (2014). Why psychologists’ food fight matters. Slate, 31 July. [http://www.slate.com/articles/health_and_science/science/2014/07/replication_controversy_in_psychology_bullying_file_drawer_effect_blog_posts.html]

Morey, R. D., & Lakens, D. (2016). Why most of psychology is statistically unfalsifiable. Unpublished manuscript. https://github.com/richarddmorey/psychology_resolution/blob/master/paper/response.pdf

Motyl, M., Demos, A. P., Carsel, T. S., Hanson, B. E., Melton, Z. J., Mueller, A. B., ... & Yantis, C. (2017). The state of social and personality science: Rotten to the core, not so bad, getting better, or getting worse? Journal of Personality and Social Psychology, 113(1), 34-58. (For a critique, see http://datacolada.org/60)


Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.


Schweizer, G., & Furley, P. (2016). Reproducible research in sport and exercise psychology: The role of sample sizes. Psychology of Sport and Exercise, 23, 114-122.



Tackett, J. L., Lilienfeld, S. O., Patrick, C. J., Johnson, S. L., Krueger, R. F., Miller, J. D., ... & Shrout, P. E. (2017). It’s time to broaden the replicability conversation: Thoughts for and from clinical psychological science. Perspectives on Psychological Science, 12(5), 742-756.


Johnson, V. E., Payne, R. D., Wang, T., Asher, A., & Mandal, S. (2016). On the reproducibility of psychological science. Journal of the American Statistical Association. DOI: 10.1080/01621459.2016.1240079

Wagenmakers, E. J., Beek, T., Dijkhoff, L., Gronau, Q. F., Acosta, A., Adams Jr, R. B., ... & Bulnes, L. C. (2016). Registered Replication Report: Strack, Martin, & Stepper (1988). Perspectives on Psychological Science, 11(6), 917-928.

The problems that plague us have been plaguing us for a very long time


Babbage, C. (1830). Reflections on the Decline of Science in England, and on Some of its Causes. London: B. Fellowes. Retrieved from https://books.google.com/books?id=pFxLAAAAcAAJ&source=gbs_navlinks_s

Cohen, J. (1962). The statistical power of abnormal-social psychological research: A review. Journal of Abnormal and Social Psychology, 65, 145-153.

De Groot, A. D. (1956/2014). The meaning of “significance” for different types of research [translated and annotated by Eric-Jan Wagenmakers, Denny Borsboom, Josine Verhagen, Rogier Kievit, Marjan Bakker, Angelique Cramer, Dora Matzke, Don Mellenbergh, and Han LJ van der Maas]. Acta psychologica, 148, 188-194.

Dickersin, K. (1990). The existence of publication bias and risk factors for its occurrence. Journal of the American Medical Association, 263(10), 1385-1389.


Elms, A.C. (1975). The crisis of confidence in social psychology. American Psychologist, 30, 967-976.


Feynman, R. P. (1974). Cargo cult science. Engineering and Science, 37(7), 10-13.


Forscher, B. K. (1963). Chaos in the brickyard. Science, 142(3590), 339.

Greenwald, A. G. (1975). Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82(1), 1-20.

Lykken, D. T. (1968). Statistical significance in psychological research. Psychological Bulletin, 70, 151–159. doi:10.1037/h0026141

Lykken, D. T. (1991). What’s wrong with psychology anyway? In D. Cicchetti & W. Grove (Eds.), Thinking clearly about psychology (pp. 3–39). Minneapolis: University of Minnesota Press.

Meehl, P. E. (1967). Theory-testing in psychology and physics: A methodological paradox. Philosophy of Science, 34(2), 103-115.

Meehl, P. E. (1990). Why summaries of research on psychological theories are often uninterpretable. Psychological Reports, 66(1), 195-244.


Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86(3), 638-641.

Rossi, J. S. (1990). Statistical power of psychological research: What have we gained in 20 years? Journal of Consulting and Clinical Psychology, 58(5), 646.

Sedlmeier, P., & Gigerenzer, G. (1989). Do studies of statistical power have an effect on the power of studies? Psychological Bulletin, 105(2), 309-316.

Sterling, T. D. (1959). Publication decisions and their possible effects on inferences drawn from tests of significance—or vice versa. Journal of the American Statistical Association, 54(285), 30-34.

Wachtel, P. L. (1980). Investigation and its discontents: Some constraints on progress in psychological research. American Psychologist, 35(5), 399.

Walster, G. W., & Cleary, T. A. (1970). A proposal for a new editorial policy in the social sciences. The American Statistician, 24(2), 16-19.


The problems that plague us: low power

Aguinis, H. (1995). Statistical power problems with moderated multiple regression in management research. Journal of Management, 21(6), 1141-1158.

Aguinis, H., Beaty, J. C., Boik, R. J., & Pierce, C. A. (2005). Effect size and power in assessing moderating effects of categorical variables using multiple regression: a 30-year review. Journal of Applied Psychology, 90(1), 94-107.

Button, K. S., Ioannidis, J. P., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S., & Munafò, M. R. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365-376.

Cashen, L. H., & Geiger, S. W. (2004). Statistical power and the testing of null hypotheses: A review of contemporary management research and recommendations for future studies. Organizational Research Methods, 7(2), 151-167.

Cohen, J. (1962). The statistical power of abnormal-social psychological research: A review. Journal of Abnormal and Social Psychology, 65, 145-153.

Cremers, H. R., Wager, T. D., & Yarkoni, T. (2017). The relation between statistical power and inference in fMRI. PloS one, 12(11), e0184923.

Dybå, T., Kampenes, V. B., & Sjøberg, D. I. (2006). A systematic review of statistical power in software engineering experiments. Information and Software Technology, 48(8), 745-755.

Fraley, R. C. & Marks, M.J. (2007). The null hypothesis significance-testing debate and its implications for personality research. In R.W. Robins, R.F. Krueger, & R.C. Fraley (eds.), Research Methods in Personality Psychology (Chap 9, 170-189). New York, NY: Guilford Press.

Fraley, R. C., & Vazire, S. (2014). The N-pact factor: Evaluating the quality of empirical journals with respect to sample size and statistical power. PloS one, 9(10), e109019.

Jennions, M. D., & Møller, A. P. (2003). A survey of the statistical power of research in behavioral ecology and animal behavior. Behavioral Ecology, 14(3), 438-445.

Marszalek, J. M., Barber, C., Kohlhart, J., & Cooper, B. H. (2011). Sample size in psychological research over the past 30 years. Perceptual and Motor Skills, 112(2), 331-348.

Maxwell, S. E. (2004). The persistence of underpowered studies in psychological research: causes, consequences, and remedies. Psychological Methods, 9(2), 147-163.

Moher, D., Dulberg, C. S., & Wells, G. A. (1994). Statistical power, sample size, and their reporting in randomized controlled trials. JAMA, 272(2), 122-124.

Nord, C. L., Valton, V., Wood, J., & Roiser, J. P. (2017). Power-up: A reanalysis of ‘power failure’ in neuroscience using mixture modeling. Journal of Neuroscience, 37(34), 8051-8061.

Rossi, J. S. (1990). Statistical power of psychological research: What have we gained in 20 years? Journal of Consulting and Clinical Psychology, 58(5), 646.

Sedlmeier, P., & Gigerenzer, G. (1989). Do studies of statistical power have an effect on the power of studies? Psychological Bulletin, 105(2), 309-316.

Verma, R., & Goodale, J. C. (1995). Statistical power in operations management research. Journal of Operations Management, 13(2), 139-152.

Yarkoni, T. (2009). Big correlations in little studies: Inflated fMRI correlations reflect low statistical power—Commentary on Vul et al. (2009). Perspectives on Psychological Science, 4(3), 294-298.

The problems that plague us: selective publication; bias against the null

Bakker, M., van Dijk, A., & Wicherts, J. M. (2012). The rules of the game called psychological science. Perspectives on Psychological Science, 7, 543‑554.


Bertamini, M., & Munafò, M. R. (2012). Bite-size science and its undesired side effects. Perspectives on Psychological Science, 7, 67-71.


Bones, A. K. (2012). We knew the future all along: Scientific hypothesizing is much more accurate than other forms of precognition—A satire in one part. Perspectives on Psychological Science, 7(3), 307-309.

Ebersole, C. R., Axt, J. R., & Nosek, B. A. (2016). Scientists’ Reputations Are Based on Getting It Right, Not Being Right. PLoS Biol, 14(5), e1002460.


Fanelli, D. (2010). “Positive” results increase down the hierarchy of the sciences. PloS one, 5(4), e10068.


Fanelli, D. (2011). Negative results are disappearing from most disciplines and countries. Scientometrics, 90, 891‑904.

Ferguson, C. J., & Heene, M. (2012). A vast graveyard of undead theories: Publication bias and psychological science’s aversion to the null. Perspectives on Psychological Science, 7(6), 555–561.


Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345(6203), 1502-1505.


Franco, A., Malhotra, N., & Simonovits, G. (2016). Underreporting in Psychology Experiments: Evidence from a Study Registry. Social Psychological and Personality Science, 7, 8–12.


Fuchs, H. M., Jenny, M., & Fiedler, S. (2012). Psychologists are open to change, yet wary of rules. Perspectives on Psychological Science, 7, 639-642.

Giner-Sorolla, R. (2012). Science or art? How aesthetic standards grease the way through the publication bottleneck but undermine science. Perspectives on Psychological Science, 7, 562-571.


Greenwald, A. G. (1975). Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82(1), 1-20.

Kühberger, A., Fritz, A., & Scherndl, T. (2014). Publication bias in psychology: a diagnosis based on the correlation between effect size and sample size. PloS one, 9(9), e105825.


Ledgerwood, A., & Sherman, J. W. (2012). Short, sweet, and problematic? The rise of the short report in psychological science. Perspectives on Psychological Science, 7, 60-66.


Mathieu, S., Boutron, I., Moher, D., Altman, D. G., & Ravaud, P. (2009). Comparison of registered and published primary outcomes in randomized controlled trials. JAMA, 302(9), 977-984.


Matosin, N., Frank, E., Engel, M., Lum, J. S., & Newell, K. A. (2014). Negativity towards negative results: a discussion of the disconnect between scientific worth and scientific culture. Disease Models & Mechanisms, 7(2), 171-173.

Nissen, S. B., Magidson, T., Gross, K., & Bergstrom, C. T. (2016). Publication bias and the canonization of false facts. eLife, 5, e21451.

O’Boyle, E. H., Banks, G. C., & Gonzalez-Mulé, E. (2017). The Chrysalis Effect: How ugly initial results metamorphosize into beautiful articles. Journal of Management, 43(2), 376–399. http://doi.org/10.1177/0149206314527133


Pashler, H., & Harris, C. R. (2012). Is the replicability crisis overblown? Three arguments examined. Perspectives on Psychological Science, 7, 531‑536.


Peterson, D. (2016). The baby factory: Difficult research objects, disciplinary standards, and the production of statistical significance. Socius, 2, 2378023115625071.


Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384.


Tijdink, J. K., Verbeke, R., & Smulders, Y. M. (2014). Publication pressure and scientific misconduct in medical scientists. Journal of Empirical Research on Human Research Ethics, 9(5), 64-71.

The problems that plague us: procedural overfitting 

Carp, J. (2012). On the plurality of (methodological) worlds: estimating the analytic flexibility of fMRI experiments. Frontiers in Neuroscience, 6.

Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. Unpublished manuscript. http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf

Head, M. L., Holman, L., Lanfear, R., Kahn, A. T., & Jennions, M. D. (2015). The extent and consequences of p-hacking in science. PLoS Biology, 13(3), e1002106. http://doi.org/10.1371/journal.pbio.1002106


John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524-532.


Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2(3), 196-217.


LeBel, E.P., & Peters, K.R. (2011). Fearing the future of empirical psychology: Bem’s (2011) evidence of psi as a case study in deficiencies in modal research practice. Review of General Psychology, 15, 371-379. https://etiennelebel.com/documents/l&p(2011,rgp).pdf

Murphy, K. R., & Aguinis, H. (2017). HARKing: How Badly Can Cherry-Picking and Question Trolling Produce Bias in Published Results?. Journal of Business and Psychology, 1-17.

Nieuwenhuis, S., Forstmann, B. U., & Wagenmakers, E. J. (2011). Erroneous analyses of interactions in neuroscience: a problem of significance. Nature Neuroscience, 14(9), 1105-1107.


Sedlmeier, P. & Gigerenzer, G. (1989). Do studies of statistical power have an effect on the power of studies? Psychological Bulletin, 105, 309-316.


Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359–1366.


Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2017). False-Positive Citations. Perspectives on Psychological Science. https://papers.ssrn.com/sol3/Papers.cfm?abstract_id=2916240


Wicherts, J. M., Veldkamp, C. L., Augusteijn, H. E., Bakker, M., Van Aert, R., & Van Assen, M. A. (2016). Degrees of freedom in planning, running, analyzing, and reporting psychological studies: A checklist to avoid p-hacking. Frontiers in Psychology, 7, 1832.
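
The procedural-overfitting papers in this section lend themselves to simulation. Below is a toy sketch in Python (using numpy and scipy; an illustration of the argument in Simmons, Nelson, & Simonsohn, 2011, not their code) in which the true effect is zero, yet an analyst who tests two correlated outcome measures and their average, reporting whichever one "works," far exceeds the nominal 5% false-positive rate.

```python
# Toy simulation of researcher degrees of freedom (cf. Simmons, Nelson,
# & Simonsohn, 2011). The true group effect is zero; flexibility across
# correlated outcome measures inflates the false-positive rate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, sims, alpha = 20, 10_000, 0.05
cov = [[1.0, 0.5], [0.5, 1.0]]        # two DVs per subject, correlated r = .5
planned_hits = flexible_hits = 0

for _ in range(sims):
    control = rng.multivariate_normal([0, 0], cov, n)
    treatment = rng.multivariate_normal([0, 0], cov, n)
    ps = [stats.ttest_ind(treatment[:, k], control[:, k]).pvalue for k in (0, 1)]
    ps.append(stats.ttest_ind(treatment.mean(axis=1), control.mean(axis=1)).pvalue)
    planned_hits += ps[0] < alpha      # report only the preregistered DV
    flexible_hits += min(ps) < alpha   # report whichever analysis "worked"

print(f"planned analysis:  {planned_hits / sims:.3f}")   # close to .05
print(f"flexible analysis: {flexible_hits / sims:.3f}")  # well above .05
```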

The problems that plague us: quality control  

Brown, N. J., & Heathers, J. A. (2016). The GRIM test: A simple technique detects numerous anomalies in the reporting of results in psychology. Social Psychological and Personality Science, 1948550616673876. http://journals.sagepub.com/doi/pdf/10.1177/1948550616673876

Cizek, G. J. (2012). Defining and distinguishing validity: Interpretations of score meaning and justifications of test use. Psychological Methods, 17(1), 31–43.



Ioannidis, J. P. (2012). Why science is not necessarily self-correcting. Perspectives on Psychological Science, 7(6), 645-654.


Loken, E., & Gelman, A. (2017). Measurement error and the replication crisis. Science, 355(6325), 584-585. http://science.sciencemag.org/content/355/6325/584


Nuijten, M. B., Hartgerink, C. H., van Assen, M. A., Epskamp, S., & Wicherts, J. M. (2016). The prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods, 48(4), 1205-1226.

Peters, D. P., & Ceci, S. J. (1982). Peer-review practices of psychological journals: The fate of published articles, submitted again. Behavioral and Brain Sciences, 5(2), 187–255.
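
The GRIM test above (Brown & Heathers, 2016) is pure arithmetic: if N participants answer on an integer scale, the reported mean must equal some integer total divided by N. A minimal sketch of that check, assuming integer item responses and a mean reported to two decimals:

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """GRIM check (after Brown & Heathers, 2016): can any sum of n
    integer responses produce this mean once rounded to `decimals`?"""
    target = round(reported_mean, decimals)
    center = round(reported_mean * n)   # candidate integer totals
    return any(round(total / n, decimals) == target
               for total in range(center - 1, center + 2))

print(grim_consistent(3.47, 25))   # False: no 25 integer responses average to 3.47
print(grim_consistent(3.47, 100))  # True: 347/100 = 3.47
```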


NHST, P-values, and the like 


Abelson, R. P. (1997). On the Surprising Longevity of Flogged Horses: Why There Is a Case for the Significance Test. Psychological Science, 8(1), 12–15.


Benjamin, D. J., Berger, J. O., Johannesson, M., Nosek, B. A., Wagenmakers, E. J., Berk, R., ... & Cesarini, D. (2017). Redefine statistical significance. Nature Human Behaviour, 1. doi:10.1038/s41562-017-0189-z (https://www.nature.com/articles/s41562-017-0189-z)


Carver, R. P. (1978). The case against statistical significance testing. Harvard Educational Review, 48, 378-399.


Chavalarias, D., Wallach, J. D., Li, A. H. T., & Ioannidis, J. P. (2016). Evolution of reporting P values in the biomedical literature, 1990-2015. JAMA, 315(11), 1141-1148.


Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45(12), 1304–1312.


Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.


Cowles, M., & Davis, C. (1982). On the origins of the .05 level of statistical significance. American Psychologist, 37, 553-558.


Cumming, G. (2014). The new statistics: Why and how. Psychological science, 25(1), 7-29.



Fraley, R. C. & Marks, M.J. (2007). The null hypothesis significance-testing debate and its implications for personality research. In R.W. Robins, R.F. Krueger, & R.C. Fraley (eds.), Research Methods in Personality Psychology (Chap 9, 170-189). New York, NY: Guilford Press.


Frick, R. W. (1996). The appropriate use of null hypothesis testing. Psychological Methods, 1, 379-390.


Gelman, A. (2013). Commentary: P values and statistical practice. Epidemiology, 24(1), 69-72.


Gigerenzer, G. (2002). The superego, the ego, and the id in statistical reasoning. In G. Gigerenzer (Ed.), Adaptive thinking: Rationality in the real world. Oxford University Press.


Gigerenzer, G. (2004). Mindless statistics. Journal of Socio-Economics, 33, 587-606.


Gigerenzer, G., & Marewski, J. N. (2015). Surrogate science: The idol of a universal method for scientific inference. Journal of Management, 41(2), 421-440.


Gigerenzer, G., Krauss, S., & Vitouch, O. (2004). The null ritual. The Sage handbook of quantitative methodology for the social sciences, 391-408.


Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. European Journal of Epidemiology, 31(4), 337–350. http://doi.org/10.1007/s10654-016-0149-3


Greenwald, A. G., Gonzalez, R., Harris, R. J., & Guthrie, D. (1996). Effect sizes and p values: What should be reported and what should be replicated? Psychophysiology, 33, 175-183.


Greenwald, A.G. (1975). Consequences of prejudice against the null hypothesis. Psychological Bulletin, 82, 1-20.


Hagen, R. L. (1997). In praise of the null hypothesis statistical test. American Psychologist, 52, 15-24.

Halsey, L. G., Curran-Everett, D., Vowler, S. L., & Drummond, G. B. (2015). The fickle P value generates irreproducible results. Nature Methods, 12(3), 179-185.

Harris, R. J. (1997). Significance tests have their place. Psychological Science, 8, 8-11.


Hauer, E. (2004).  The harm done by tests of significance. Accident Analysis and Prevention, 36, 495-500.


Johansson, T. (2011). Hail the impossible: p-values, evidence, and likelihood. Scandinavian Journal of Psychology, 52, 113-125.

Krueger, J. (2001). Null hypothesis significance testing: On the survival of a flawed method. American Psychologist, 56, 16-26.


Lakens, D. et al. (2017). Justify your alpha: A response to “redefine statistical significance”. DOI: 10.17605/OSF.IO.9S3Y6 (https://psyarxiv.com/9s3y6/)


Lykken, D. T. (1968). Statistical significance in psychological research. Psychological Bulletin, 70, 151-159.


McCrary, J., Christensen, G., & Fanelli, D. (2016). Conservative tests under satisficing models of publication bias. PloS one, 11(2), e0149590.

McShane, B. B., & Gal, D. (2017). Statistical significance and the dichotomization of evidence. Journal of the American Statistical Association, 112(519), 885-895.

McShane, B. B., Gal, D., Gelman, A., Robert, C., & Tackett, J. L. (2017). Abandon statistical significance. arXiv preprint arXiv:1709.07588.


Meehl, P. E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46, 806-834.


Nickerson, R. S. (2000). Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods, 5, 241-301.


Nuzzo, R. (2014). Statistical errors: P values, the ‘gold standard’ of statistical validity, are not as reliable as many scientists assume. Nature, 506(7487), 150-152.


Pek, J., & Flora, D. B. (2017). Reporting effect sizes in original psychological research: A discussion and tutorial. Psychological Methods. http://doi.org/10.1037/met0000126


Rosnow, R.L., & Rosenthal, R. (1989). Statistical procedures and the justification of knowledge in psychological science. American Psychologist, 44, 1276-1284.


Rouder, J. N., Morey, R. D., Verhagen, J., Province, J. M., & Wagenmakers, E. J. (2016). Is there a free lunch in inference? Topics in Cognitive Science, 8, 520-547.

Sangnier, M., & Zylberberg, Y. (2016). Star wars: The empirics strike back. American Economic Journal: Applied Economics, 8(1), 1-32.

Savalei, V., & Dunn, E. (2015). Is the call to abandon p-values the red herring of the replicability crisis? Frontiers in Psychology, 6.


Schmidt, F. L. (1996). Statistical significance testing and cumulative knowledge in psychology: implications for training researchers. Psychological Methods, 1, 115-129.


Schmidt, F. L., & Hunter, J. E. (1997). Eight common but false objections to the discontinuation of significance testing in the analysis of research data. In L. L. Harlow, S. A. Mulaik, & J. H. Steiger (Eds.), What if there were no significance tests? (pp. 37-64). Mahwah, NJ: Erlbaum.

Wainer, H. (1999). One cheer for null hypothesis significance testing. Psychological Methods, 4(2), 212-213.


Wasserstein, R. L., & Lazar, N. A. (2016). The ASA's statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129-133. DOI: 10.1080/00031305.2016.1154108


Wilkinson, L. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.

Preregistration


Chambers, C. D. (2013). Registered reports: a new publishing initiative at Cortex. Cortex, 49(3), 609-610.

Chambers, C. D., Feredoes, E., Muthukumaraswamy, S. D., & Etchells, P. (2014). Instead of “playing the game” it is time to change the rules: Registered Reports at AIMS Neuroscience and beyond. AIMS Neuroscience, 1(1), 4–17.


Dal-Ré, R., Ioannidis, J. P., Bracken, M. B., Buffler, P. A., Chan, A.-W., Franco, E. L., La Vecchia, C., & Weiderpass, E. (2014). Making prospective registration of observational research a reality. Science Translational Medicine, 6(224), 224cm1.


Humphreys, M., Sanchez de la Sierra, R., & van der Windt, P. (2013). Fishing, commitment, and communication: A proposal for comprehensive nonbinding research registration. Political Analysis, 21, 1-20.


Lin, W., & Green, D. P. (2016). Standard operating procedures: A safety net for pre-analysis plans. PS: Political Science & Politics, 49(3), 495-500.

Lindsay, D. S., Simons, D. J., & Lilienfeld, S. O. (2016). Research preregistration 101. APS Observer, November 2016. https://www.psychologicalscience.org/observer/research-preregistration-101

Mellor, D. T., & Nosek, B. A. (2018). Easy preregistration will benefit any research. Nature Human Behaviour. doi:10.1038/s41562-018-0294-7

Nosek, B. A., & Lakens, D. (2014). Registered reports: A method to increase the credibility of published results. Social Psychology, 45(3), 137–141.


Nosek, B. A., Ebersole, C. R., DeHaven, A., & Mellor, D. (2017). The Preregistration Revolution. [preprint] https://osf.io/2dxu5/


Simons, D. J., Holcombe, A. O., & Spellman, B. A. (2014). An introduction to Registered Replication Reports at Perspectives on Psychological Science. Perspectives on Psychological Science, 9(5), 552–555.

van ‘t Veer, A.E., & Giner-Sorolla, R. (2016). Pre-registration in social psychology—A discussion and suggested template. Journal of Experimental Social Psychology, 67, 2-12. (http://www.sciencedirect.com/science/article/pii/S0022103116301925)


Wagenmakers, E. J., Beek, T., Dijkhoff, L., Gronau, Q. F., Acosta, A., Adams, R. B., ... & Bulnes, L. C. (2016). Registered Replication Report: Strack, Martin, & Stepper (1988). Perspectives on Psychological Science, 11(6), 917-928.


Walster, G. W., & Cleary, T. A. (1970). A proposal for a new editorial policy in the social sciences. The American Statistician, 24(2), 16-19.

Power and power analysis 

Albers, C., & Lakens, D. (2017). When power analyses based on pilot data are biased: Inaccurate effect size estimators and follow-up bias. DOI: 10.17605/OSF.IO/B7Z4Q (https://psyarxiv.com/b7z4q/)


Anderson, S. F., Kelley, K., & Maxwell, S. E. (2017). Sample-size planning for more accurate statistical power: A method adjusting sample effect sizes for publication bias and uncertainty. Psychological Science, 28(11), 1547-1562. http://doi.org/10.1177/0956797617723724


Bakker, M., et al. (2016). Researchers’ intuitions about power in psychological research. Psychological Science, 27, 1069-1077.


Bosco, F. A., Aguinis, H., Field, J. G., & Pierce, C. A. (2015). Correlational Effect Size Benchmarks. Journal of Applied Psychology, 100(2), 431–449.


Brand, A., Bradley, M. T., Best, L. A., & Stoica, G. (2008). Accuracy of effect size estimates from published psychological research. Perceptual and Motor Skills, 106(2), 645-649.

Brysbaert, M., & Stevens, M. (2018). Power analysis and effect size in mixed effects models: A tutorial. Journal of Cognition, 1(1), 9, 1–20. DOI: https://doi.org/10.5334/joc.10

Cohen, J. (1992). Statistical power analysis. Current Directions in Psychological Science, 1(3), 98–101.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159.


Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25, 7-29.


Gelman, A., & Carlin, J. (2014). Beyond power calculations: Assessing Type S (sign) and Type M (magnitude) errors. Perspectives on Psychological Science, 9(6), 641-651.


Gervais et al. (2015). A powerful nudge? Presenting calculable consequences of underpowered research shifts incentives towards adequately powered designs. Social Psychological and Personality Science, 6, 847-854.


Gignac, G. E., & Szodorai, E. T. (2016). Effect size guidelines for individual differences researchers. Personality and Individual Differences, 102, 74–78.


Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C., Goodman, S. N., & Altman, D. G. (2016). Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. European Journal of Epidemiology, 31(4), 337–350.

Judd, C. M., Westfall, J., & Kenny, D. A. (2017). Experiments with More Than One Random Factor: Designs, Analytic Models, and Statistical Power. Annual Review of Psychology, 68(1), 601–625. doi: 10.1146/annurev-psych-122414-033702

Kelley, K., & Rausch, J. R. (2006). Sample size planning for the standardized mean difference: accuracy in parameter estimation via narrow confidence intervals. Psychological Methods, 11(4), 363.

Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4.


Lakens, D. (2014). Performing high‐powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701-710.

Maxwell, S. E. (2004). The persistence of underpowered studies in psychological research: causes, consequences, and remedies. Psychological Methods, 9(2), 147-163.


Maxwell, S. E., Kelley, K., & Rausch, J. R. (2008). Sample Size Planning for Statistical Power and Accuracy in Parameter Estimation. Annual Review of Psychology, 59(1), 537–563.

McShane, B. B., & Böckenholt, U. (2014). You cannot step into the same river twice: When power analyses are optimistic. Perspectives on Psychological Science, 9(6), 612-625.

Paterson, T. A., Harms, P. D., Steel, P., & Credé, M. (2016). An assessment of the magnitude of effect sizes: Evidence from 30 years of meta-analysis in management. Journal of Leadership & Organizational Studies, 23(1), 66-81.


Pereira, T. V., Horwitz, R. I., & Ioannidis, J. P. (2012). Empirical evaluation of very large treatment effects of medical interventions. JAMA, 308(16), 1676-1684.


Perugini, M., Gallucci, M., & Costantini, G. (2014). Safeguard power as a protection against imprecise power estimates. Perspectives on Psychological Science, 9, 319-332.

Reddan, M. C., Lindquist, M. A., & Wager, T. D. (2017). Effect size estimation in neuroimaging. JAMA Psychiatry, 74(3), 207-208. doi:10.1001/jamapsychiatry.2016.3356

Rosnow, R. L., & Rosenthal, R. (2009). Effect sizes: Why, when, and how to use them. Zeitschrift für Psychologie / Journal of Psychology, 217(1), 6–14.

Schimmack, U. (2012). The ironic effect of significant results on the credibility of multiple-study articles. Psychological Methods, 17, 551-566.


Schönbrodt, F. D., & Perugini, M. (2013). At what sample size do correlations stabilize? Journal of Research in Personality, 47(5), 609–612.


Simonsohn, U. (2015). Small telescopes: Detectability and the evaluation of replication results. Psychological Science, 26, 559–569. doi:10.1177/0956797614567341

Stukas, A. A., & Cumming, G. (2014). Interpreting effect sizes: Toward a quantitative cumulative social psychology. European Journal of Social Psychology, 44(7), 711-722.

Westfall, J., Kenny, D. A., & Judd, C. M. (2014). Statistical power and optimal design in experiments in which samples of participants respond to samples of stimuli. Journal of Experimental Psychology: General, 143(5), 2020.
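
As a companion to the readings in this section, the sketch below runs the basic a priori power analysis most of them presuppose: given an assumed effect size and alpha, how many participants per group does a two-sample t-test need? It assumes the Python statsmodels package; G*Power and R's pwr package perform the same computation.

```python
# A priori power analysis for a two-sample t-test.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for power in (0.80, 0.90):
    n_per_group = analysis.solve_power(effect_size=0.4, alpha=0.05,
                                       power=power, alternative='two-sided')
    print(f"d = 0.4, {power:.0%} power: n = {n_per_group:.1f} per group")
# Roughly 99 per group for 80% power and 132 for 90%, which is why the
# small samples documented in the surveys above are so troubling.
```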

On Replication 

Brandt, M. J., IJzerman, H., Dijksterhuis, A., Farach, F. J., Geller, J., Giner-Sorolla, R., Grange, J. A., et al. (2014). The Replication Recipe: What makes for a convincing replication? Journal of Experimental Social Psychology, 50, 217–224.


Braver, S.L., Thoemmes, F.J., & Rosenthal, R. (2014). Continuously cumulating meta-analysis and replicability. Perspectives on Psychological Science, 9, 333-342.

Cesario, J. (2014). Priming, replication, and the hardest science. Perspectives on Psychological Science, 9, 40-48.


Duncan, G. J., Engel, M., Claessens, A., & Dowsett, C. J. (2014). Replication and robustness in developmental research. Developmental Psychology, 50(11), 2417.


Fetterman, A. K., & Sassenberg, K. (2015). The reputational consequences of failed replications and wrongness admission among scientists. PloS One, 10(12), e0143723. http://doi.org/10.1371/journal.pone.0143723


Klein, R. A., Ratliff, K. A., Vianello, M., Adams Jr, R. B., Bahník, Š., Bernstein, M. J., ... & Cemalcilar, Z. (2014). Investigating variation in replicability: A "many labs" replication project. Social Psychology, 45(3), 142-152.


Koole, S. L., & Lakens, D. (2012). Rewarding replications: A sure and simple way to improve psychological science. Perspectives on Psychological Science, 7(6), 608–614. doi:10.1177/1745691612462586


Kunert, R. (2016). Internal conceptual replications do not increase independent replication success. Psychonomic Bulletin & Review, 23(5), 1631-1638.

Hartshorne, J. K., & Schachner, A. (2012). Tracking replicability as a method of post-publication open evaluation. Frontiers in Computational Neuroscience, 6. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3293145/pdf/fncom-06-00008.pdf

Hüffmeier, J., Manzei, J., & Schultze, T. (2016). Reconceptualizing replication as a sequence of different studies: A replication typology. Journal of Experimental Social Psychology, 66, 81-92.

LeBel, E. P., Berger, D., Campbell, L., & Loving, T. J. (2017). Falsifiability is not optional. Journal of Personality and Social Psychology, 113, 254-261. http://etiennelebel.com/documents/lbcl(2017,jpsp).pdf

Makel, M. C., & Plucker, J. A. (2014). Facts are more important than novelty: Replication in the education sciences. Educational Researcher, 43(6), 304-316.


Makel, M.C., Plucker, J.A., & Hegarty, B. (2012). Replications in psychology research: How often do they really occur? Perspectives on Psychological Science, 7, 537-542.


Matzke, D., Nieuwenhuis, S., van Rijn, H., Slagter, H. A., van der Molen, M. W., & Wagenmakers, E. J. (2015). The effect of horizontal eye movements on free recall: A preregistered adversarial collaboration. Journal of Experimental Psychology: General, 144(1), e1–e15. doi:10.1037/xge0000038


Maxwell, S. E., Lau, M. Y., & Howard, G. S. (2015). Is psychology suffering from a replication crisis? What does “failure to replicate” really mean? American Psychologist, 70(6), 487–498. http://doi.org/10.1037/a0039400

McElreath, R., & Smaldino, P. E. (2015). Replication, communication, and the population dynamics of scientific discovery. PLoS One, 10(8), e0136088.

Neuliep, J. W., & Crandall, R. (1990). Editorial bias against replication research. Journal of Social Behavior and Personality, 5, 85–90.


Neuliep, J. W., & Crandall, R. (1993). Reviewer bias against replication research. Journal of Social Behavior and Personality, 8, 21–29.

Pashler, H., & Harris, C. R. (2012). Is the replicability crisis overblown? Three arguments examined. Perspectives on Psychological Science, 7, 531‑536.


Schmidt, S. (2009). Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology, 13(2), 90–100.


Simons, D. J. (2014). The value of direct replication. Perspectives on Psychological Science, 9(1), 76-80.


Simons, D. J., Shoda, Y., & Lindsay, D. S. (in press). Constraints on Generality (COG) statements are needed to define direct replication. Commentary on Zwaan et al. (in press), Behavioral and Brain Sciences.

Simonsohn, U. (2015). Small telescopes: Detectability and the evaluation of replication results. Psychological Science, 26, 559–569. doi:10.1177/0956797614567341

Spence, J. R., & Stanley, D. J. (2016). Prediction Interval: What to Expect When You’re Expecting… A Replication. PloS one, 11(9), e0162874.


Stanley, D. J., & Spence, J. R. (2014). Expectations for replications: Are yours realistic?. Perspectives on Psychological Science, 9(3), 305-318.


Wagenmakers, E.-J., Wetzels, R., Borsboom, D., van der Maas, H. L. J., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7(6), 632–638. doi:10.1177/1745691612463078

Westfall, J., Judd, C. M., & Kenny, D. A. (2015). Replicating studies in which samples of participants respond to samples of stimuli. Perspectives on Psychological Science, 10(3), 390-399.

Zwaan, R. A., Etz, A., Lucas, R. E., & Donnellan, M. B. (in press). Making replication mainstream. Behavioral and Brain Sciences.


Open Science 

Aman, V. (2014). Is there any measurable benefit in publishing preprints in the arXiv section Quantitative Biology? arXiv:1411.1955 (https://arxiv.org/pdf/1411.1955.pdf)


Bergstrom, C. T., Foster, J. G., & Song, Y. (2017). Why scientists chase big problems: Individual strategy and social optimality. arXiv:1605.05822 [physics.soc-ph]: https://arxiv.org/pdf/1605.05822v2.pdf

Campbell, L., Loving, T. J., & LeBel, E. P. (2014). Enhancing transparency of the research process to increase accuracy of findings: A guide for relationship researchers. Personal Relationships, 21, 531-545. (http://etiennelebel.com/documents/cl&l(2014,pr).pdf)


Fecher, B., & Friesike, S. (2014). Open science: One term, five schools of thought. In S. Bartling & S. Friesike (Eds.), Opening Science. http://doi.org/10.1007/978-3-319-00026-8


Fraser, R., & Willison, D. (2009). Tools for de-identification of personal health information. Pan Canadian Health Information Privacy (HIP) Group. http://www.ehealthinformation.ca/wp-content/uploads/2014/08/2009-Tools-for-De-Identification-of-Personal-Health.pdf


Friedlin, F. J., & McDonald, C. J. (2008). A software tool for removing patient identifying information from clinical documents. Journal of the American Medical Informatics Association, 15, 601– 610. (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2528047/)


Goodman, A., et al. (2014). Ten simple rules for the care and feeding of scientific data. PLoS Computational Biology, 10(4), e1003542.


Heffetz, O., & Ligett, K. (2014). Privacy and data-based research. Journal of Economic Perspectives, 28(2), 75-98.



McKiernan, E.C. et al. (2016). Point of view: How open science helps researchers succeed. eLife 2016;5:e16800 DOI: 10.7554/eLife.16800 (https://elifesciences.org/articles/16800) (Presentation of these ideas by E. McKiernan: https://www.youtube.com/watch?v=qFsc6rf8kOs)

Mellor, D. T., Vazire, S., & Lindsay, D. S. (2018, January 16). Transparent science: A more credible, reproducible, and publishable way to do science. Retrieved from psyarxiv.com/7wkdn

Miguel, E., C. Camerer, K. Casey, J. Cohen, K. M. Esterling, A. Gerber, R. Glennerster, D. P. Green, M. Humphreys, G. Imbens, D. Laitin, T. Madon, L. Nelson, B. A. Nosek, M. Petersen, R. Sedlmayr, J. P. Simmons, U. Simonsohn, M. Van der Laan. (2014). Promoting transparency in social science research. Science, 343(6166), 30-31. DOI: 10.1126/science.1245317.


Morey, R. D., Chambers, C. D., Etchells, P. J., Harris, C. R., Hoekstra, R., Lakens, D., … others. (2016). The Peer Reviewers’ Openness Initiative: incentivizing open research practices through peer review. Royal Society Open Science, 3(1), 150547.


Nosek, B. A., & Bar-Anan, Y. (2012). Scientific Utopia: I. Opening Scientific Communication. Psychological Inquiry, 23(3), 217–243.


Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., Breckler, S. J., … Yarkoni, T. (2015). Promoting an open research culture. Science, 348(6242), 1422–1425.


Piwowar, H. A., & Vision, T. J. (2013). Data reuse and the open data citation advantage. PeerJ, 1, e175. DOI: 10.7717/peerj.175

Poldrack, R. A., Baker, C. I., Durnez, J., Gorgolewski, K. J., Matthews, P. M., Munafò, M. R., ... & Yarkoni, T. (2017). Scanning the horizon: towards transparent and reproducible neuroimaging research. Nature Reviews Neuroscience, 18(2), 115-126.


Rouder, J. N. (2015). The what, why, and how of born-open data. Behavior Research Methods. DOI 10.3758/s13428-015-0630-z (https://link.springer.com/article/10.3758/s13428-015-0630-z)


Simonsohn, U. (2013). Just post it: The lesson from two cases of fabricated data detected by statistics alone. Psychological Science, 24(10), 1875-1888.

Spellman, B., Gilbert, E. A., & Corker, K. S. (2017, September 20). Open Science: What, Why, and How. Retrieved from psyarxiv.com/ak6jr

Stodden, V. (2011). Trust your science? Open your data and code. https://web.stanford.edu/~vcs/papers/TrustYourScience-STODDEN.pdf


Sweeney, L., Abu, A., & Winn, J. (2013). Identifying participants in the Personal Genome Project by name. http://dataprivacylab.org/projects/pgp/1021-1.pdf

Vanpaemel, W., Vermorgen, M., Deriemaecker, L., & Storms, G. (2015). Are we wasting a good crisis? The availability of psychological research data after the storm. Collabra, 1, 1-5. doi:10.1525/collabra.13


Vazire, S. (2017). Quality uncertainty erodes trust in science. Collabra: Psychology, 3(1). DOI: http://doi.org/10.1525/collabra.74  

Wicherts, J. M., & Crompvoets, E. A. V. (2017). The poor availability of syntaxes of structural equation modeling. Accountability in Research, 24(8), 458-468. DOI: 10.1080/08989621.2017.1396214 (http://www.tandfonline.com/doi/full/10.1080/08989621.2017.1396214)


Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61(7), 726.

Complexities in data availability


Gilmore, R., Kennedy, J., Adolph, K. (in press). Practical solutions for sharing data and materials from psychological research. Advances in Methods and Practices in Psychological Science.

Houtkoop, B., Bishop, D., Chambers, C., Macleod, M., Nichols, T., Wagenmakers, E.-J. (in press). Data sharing in psychology: A survey on barriers and preconditions. Advances in Methods and Practices in Psychological Science.

Joel, S., Eastwick, P., & Finkel, E. (in press). Open sharing of data on close relationships and other sensitive social psychological topics: Challenges, tools, and future directions. Advances in Methods and Practices in Psychological Science.

Levenstein, M., & Lyle, J. (in press). Data: Sharing is caring. Advances in Methods and Practices in Psychological Science.

Meyer, M.N. (in press). Practical tips for ethical data sharing. Advances in Methods and Practices in Psychological Science.

Schönbrodt, F., Gollwitzer, M., & Abele-Brehm, A. (2017). Der Umgang mit Forschungsdaten im Fach Psychologie: Konkretisierung der DFG-Leitlinien. [Data Management in Psychological Science: Specification of the DFG Guidelines]. Psychologische Rundschau, 68, 20–35. doi:10.1026/0033-3042/a000341. English version available at: https://psyarxiv.com/vhx89/

Soderberg, C. (in press). Using the Open Science Framework to share data: A step-by-step guide. Advances in Methods and Practices in Psychological Science.

Walsh, C., Xia, W., Li, M., Denny, J., Harris, P., & Malin, B. (in press). Enabling open science initiatives in clinical psychology and psychiatry without sacrificing patient privacy: Current practices and future challenges. Advances in Methods and Practices in Psychological Science.

Informational value of existing research

Bakker, M., & Wicherts, J. M. (2011). The (mis)reporting of statistical results in psychology journals. Behavior Research Methods, 43, 666-678.

Bishop, D. V, & Thompson, P. A. (2015). Problems in using p-curve analysis and text-mining to detect rate of p-hacking. doi:10.7287/peerj.preprints.1266v3

Carter, E., Schönbrodt, F., Gervais, W. M., & Hilgard, J. (2017). Correcting for bias in psychology: A comparison of meta-analytic methods.


Dreber, A., Pfeiffer, T., Almenberg, J., Isaksson, S., Wilson, B., Chen, Y., ... & Johannesson, M. (2015). Using prediction markets to estimate the reproducibility of scientific research. Proceedings of the National Academy of Sciences, 112(50), 15343-15347.


Etz, A., & Vandekerckhove, J. (2016). A Bayesian perspective on the Reproducibility Project: Psychology. PLoS ONE, 11(2), e0149794. DOI: 10.1371/journal.pone.0149794

Forstmeier, W., Wagenmakers, E.-J., & Parker, T. H. (2016). Detecting and avoiding likely false-positive findings: A practical guide. Biological Reviews. doi: 10.1111/brv.12315


Fraley, R. C., & Vazire, S. (2014). The N-pact factor: Evaluating the quality of empirical journals with respect to sample size and statistical power. PloS one, 9(10), e109019.


Francis, G. (2014). The frequency of excess success for articles in Psychological Science. Psychonomic Bulletin and Review.


Francis, G., Tanzman, J., & Matthews, W. J. (2014). Excess success for psychology articles in the journal Science. PloS one, 9(12), e114255.


Gadbury, G. L., & Allison, D. B. (2012). Inappropriate fiddling with statistical analyses to obtain a desirable p-value: tests to detect its presence in published literature. PloS one, 7(10), e46363. http://doi.org/10.1371/journal.pone.0046363


Ioannidis, J.P.A. & Trikalinos, T.A. (2007). An exploratory test for an excess of significant findings. Clinical Trials, 4, 245-253.


Lakens, D., & Etz, A. J. (2017). Too true to be bad: When sets of studies with significant and nonsignificant findings are probably true. Social Psychological and Personality Science, 8(8), 875-881.


Lakens, D., Hilgard, J., & Staaks, J. (2016). On the reproducibility of meta-analyses: Six practical recommendations. BMC psychology, 4(1), 24.

McShane, B. B., Böckenholt, U., & Hansen, K. T. (2016). Adjusting for publication bias in meta-analysis. Perspectives on Psychological Science, 11(5), 730–749. http://doi.org/10.1177/1745691616662243


Moreno, S. G., Sutton, A. J., Turner, E. H., Abrams, K. R., Cooper, N. J., Palmer, T. M., & Ades, A. E. (2009). Novel methods to deal with publication biases: secondary analysis of antidepressant trials in the FDA trial registry database and related journal publications. BMJ, 339, b2981.


Rohrer, J. M., Egloff, B., & Schmukle, S. C. (2017). Probing birth-order effects on narrow traits using specification-curve analysis. Psychological Science, 28(12), 1821-1832. http://doi.org/10.1177/0956797617723726


Schimmack, U. (2012). The ironic effect of significant results on the credibility of multiple-study articles. Psychological Methods, 17, 551-566.


Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014). P-curve: A key to the file-drawer. Journal of Experimental Psychology: General, 143(2), 534.


Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2015). Better P-curves: Making P-curve analysis more robust to errors, fraud, and ambitious P-hacking, a reply to Ulrich and Miller (2015). Journal of Experimental Psychology: General, 144(6), 1146-1152.


Stanley, T. D., & Doucouliagos, H. (2014). Meta‐regression approximations to reduce publication selection bias. Research Synthesis Methods, 5(1), 60-78.

Schimmack, U. (2014). The Test of Insufficient Variance (TIVA): A new tool for the detection of questionable research practices. https://replicationindex.wordpress.com/2014/12/30/the-test-of-insufficient-variance-tiva-a-new-tool-for-the-detection-of-questionable-research-practices/


Schimmack, U., & Brunner, J. (2017). Z-curve: A method for estimating replicability based on test statistics in original studies. https://replicationindex.files.wordpress.com/2017/11/z-curve-submission-draft.pdf

Van Elk, M., Matzke, D., Gronau, Q. F., Guan, M., Vandekerckhove, J., & Wagenmakers, E. J. (2015). Meta-analyses are no substitute for registered replications: A skeptical perspective on religious priming. Frontiers in Psychology, 6.
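
Schimmack's (2014) Test of Insufficient Variance, listed above, turns on one observation: p-values from honestly reported studies convert to z-scores that should vary with sampling error (variance near 1 across exact replications), so a set of uniformly just-significant results with much smaller variance suggests selection. A toy implementation of that logic (a sketch for this list, not Schimmack's code), assuming independent two-sided p-values:

```python
# Toy Test of Insufficient Variance (after Schimmack, 2014).
import numpy as np
from scipy import stats

def tiva(p_values):
    z = stats.norm.isf(np.asarray(p_values) / 2)   # two-sided p -> |z|
    k = len(z)
    var_z = z.var(ddof=1)
    # Under the variance = 1 assumption, (k-1)*var(z) ~ chi-square(k-1);
    # a small left-tail probability flags "too similar" results.
    p_left = stats.chi2.cdf((k - 1) * var_z, df=k - 1)
    return var_z, p_left

var_z, p = tiva([.049, .038, .041, .033, .045])    # suspiciously tidy set
print(f"var(z) = {var_z:.3f}, left-tail p = {p:.2g}")  # variance far below 1
```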

Solutions

Aguinis, H., Ramani, R., & Alabduljader, N. (2017). What you see is what you get? Enhancing methodological transparency in management research. Academy of Management Annals, 12(1), 83-110.

Asendorpf, J. B., Conner, M., De Fruyt, F., De Houwer, J., Denissen, J. J. A., Fiedler, K., Fiedler, S., Funder, D. C., Kliegl, R., Nosek, B. A., Perugini, M., Roberts, B. W., Schmitt, M., van Aken, M. A. G., Weber, H., & Wicherts, J. M. (2013). Recommendations for increasing replicability in psychology. European Journal of Personality, 27(2), 108-119.


Brown, S. D., Furrow, D., Hill, D. F., Gable, J. C., Porter, L. P., & Jacobs, W. J. (2014). A duty to describe: Better the devil you know than the devil you don’t. Perspectives on Psychological Science, 9(6), 626-640.

Buttliere, B. T. (2014). Using science and psychology to improve the dissemination and evaluation of scientific work. Frontiers in Computational Neuroscience, 8.

Edwards, M. A., & Roy, S. (2016). Academic research in the 21st century: Maintaining scientific integrity in a climate of perverse incentives and hypercompetition. Environmental Engineering Science. DOI: 10.1089/ees.2016.0223 http://online.liebertpub.com/doi/pdfplus/10.1089/ees.2016.0223


Finkel, E.J., Eastwick, P.W., & Reis, H.T. (2014). Best research practices in psychology: Illustrating epistemological and pragmatic considerations with the case of relationship science. Journal of Personality and Social Psychology, 108, 275-297.


Frank, M.C. et al. (2017). A collaborative approach to infant research: Promoting reproducibility, best practices, and theory-building. Infancy, 22: 421–435. doi:10.1111/infa.12182 (http://onlinelibrary.wiley.com/doi/10.1111/infa.12182/abstract)


Funder, D.C., Levine, J.M., Mackie, D.M., Morf, C.C., Sansone, C., Vazire, S., & West, S.G. (2014). Improving the dependability of research in personality and social psychology: Recommendations for research and educational practice. Personality and Social Psychology Review, 18, 3-12.


Gerber, A., & Malhotra, N. (2008). Do statistical reporting standards affect what is published? Publication bias in two leading political science journals. Quarterly Journal of Political Science, 3, 313-326.

Goh, J. X., Hall, J. A., & Rosenthal, R. (in press). Mini meta-analysis of your own studies: Some arguments on why and a primer on how. Social and Personality Psychology Compass. https://osf.io/6tfh5/


Hartshorne, J. K., & Schachner, A. (2012). Tracking replicability as a method of post-publication open evaluation. Frontiers in Computational Neuroscience, 6.


Ioannidis, J. P. (2014). How to make more published research true. PLoS medicine, 11(10), e1001747.


Kass, R. E., Caffo, B. S., Davidian, M., Meng, X. L., Yu, B., & Reid, N. (2016). Ten Simple Rules for Effective Statistical Practice. PLOS Comput Biol, 12(6), e1004961.


Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses: Sequential analyses. European Journal of Social Psychology, 44(7), 701–710.


Lakens, D., & Evers, E. R. K. (2014). Sailing From the Seas of Chaos Into the Corridor of Stability: Practical Recommendations to Increase the Informational Value of Studies. Perspectives on Psychological Science, 9(3), 278–292.


LeBel, E. P., Campbell, L., & Loving, T. J. (2017). Benefits of open and high-powered research outweigh costs. Journal of Personality and Social Psychology, 113, 230-243. http://etiennelebel.com/documents/lcl(2017,jpsp).pdf


LeBel, E. P., McCarthy, R., Earp, B., Elson, M., & Vanpaemel, W. (in press). A unified framework to quantify the credibility of scientific findings. Advances in Methods and Practices in Psychological Science. Retrieved from https://osf.io/preprints/psyarxiv/uwmr8


Lilienfeld, S. O. (2017). Psychology’s replication crisis and the grant culture: Righting the ship. Perspectives on Psychological Science, 12(4), 660-664.


Maner, J. K. (2014). Let’s put our money where our mouth is: If authors are to change their ways, reviewers (and editors) must change with them. Perspectives on Psychological Science, 9, 343-351.


Miller, J., & Ulrich, R. (2016). Optimizing research payoff. Perspectives on Psychological Science, 11(5), 664-691.


Munafò, M. R., et al. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1, 0021. DOI: 10.1038/s41562-016-0021.


Murayama, K., Pekrun, R., & Fiedler, K. (2013). Research practices that can prevent an inflation of false positive rates. Personality and Social Psychology Review, 18, 107-118.


Nelson, L. D., Simmons, J. P., & Simonsohn, U. (2012). Let’s publish fewer papers. Psychological Inquiry: An International Journal for the Advancement of Psychological Theory, 23, 291-293.

Nelson, L. D., Simmons, J., & Simonsohn, U. (2018). Psychology's renaissance. Annual Review of Psychology, 69, 511-534.

Nichols, T. E., Das, S., Eickhoff, S. B., Evans, A. C., Glatard, T., Hanke, M., ... & Proal, E. (2017). Best practices in data analysis and sharing in neuroimaging using MRI. Nature Neuroscience, 20(3), 299-303.

Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7, 615-631.


Open Science Collaboration (2017). Maximizing the reproducibility of your research. In S. O. Lilienfeld & I. D. Waldman (Eds.), Psychological Science Under Scrutiny: Recent Challenges and Proposed Solutions. New York, NY: Wiley. Retrieved from https://osf.io/xidvw/


Platt, J. R. (1964). Strong inference. Science, 146(3642), 347-353.


Schaller, M. (2015). The empirical benefits of conceptual rigor: Systematic articulation of conceptual hypotheses can reduce the risk of non-replicable results (and facilitate novel discoveries too). Journal of Experimental Social Psychology.


Schweinsberg, M., et al. (2016). The pipeline project: Pre-publication independent replications of a single laboratory's research pipeline. Journal of Experimental Social Psychology, 66, 55-67.



Shrout, P. E., & Rodgers, J. L. (2018). Psychology, science, and knowledge construction: Broadening perspectives from the replication crisis. Annual Review of Psychology, 69, 487-510.

Silberzahn, R., & Uhlmann, E.L. (2015). Many hands make tight work: Crowdsourcing research can balance discussions, validate findings and better inform policy. Nature, 526, 189-191. Full text: http://www.nature.com/news/crowdsourced-research-many-hands-make-tight-work-1.18508


Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2012). A 21 word solution. Dialogue: The Official Newsletter of the Society for Personality and Social Psychology, 26(2), 4-7.


Simons, D. J., Shoda, Y., & Lindsay, D. S. (2017). Constraints on generality (COG): A proposed addition to all empirical papers. Perspectives on Psychological Science, 12(6), 1123-1128. http://doi.org/10.1177/1745691617708630


Spellman, B. A. (2015). A short (personal) future history of revolution 2.0. Perspectives on Psychological Science, 10, 886-899. http://pps.sagepub.com/content/10/6/886.full


Steegen, S., Tuerlinckx, F., Gelman, A., & Vanpaemel, W. (2016). Increasing transparency through a multiverse analysis. Perspectives on Psychological Science, 11, 702-712. doi:10.1177/1745691616658637

Wicherts, J. M., Veldkamp, C. L., Augusteijn, H. E., Bakker, M., Van Aert, R. C., & Van Assen, M. A. (2016). Degrees of freedom in planning, running, analyzing, and reporting psychological studies: A checklist to avoid p-hacking. Frontiers in Psychology, 7, 1832.


Wilkinson, L. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54(8), 594-604.


Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspectives on Psychological Science, 12, 1100–1122.
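
Among the solutions above, Steegen, Tuerlinckx, Gelman, and Vanpaemel's (2016) multiverse analysis is concrete enough to sketch: enumerate every defensible combination of data-processing choices and report the whole distribution of results instead of one hand-picked analysis. The toy example below runs on simulated data; the exclusion rules, transform, and age cap are hypothetical stand-ins for a real study's choices.

```python
# Schematic multiverse analysis (cf. Steegen et al., 2016): the same
# group comparison under every combination of processing choices.
from itertools import product
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 200
group = rng.integers(0, 2, n)                 # 0 = control, 1 = treatment
rt = rng.lognormal(6.0, 0.4, n) + 5 * group   # reaction times, tiny true effect
age = rng.integers(18, 60, n)

outlier_rules = {"keep all": np.inf, "rt<2000": 2000, "rt<1500": 1500}
transforms = {"raw": lambda x: x, "log": np.log}
age_caps = {"all ages": 60, "under 40": 40}

for (o, cutoff), (t, f), (a, cap) in product(
        outlier_rules.items(), transforms.items(), age_caps.items()):
    keep = (rt < cutoff) & (age < cap)
    p = stats.ttest_ind(f(rt[keep & (group == 1)]),
                        f(rt[keep & (group == 0)])).pvalue
    print(f"{o:9s} {t:4s} {a:9s} p = {p:.3f}")
# The honest summary is this whole grid of p-values, not the single
# specification that happens to cross .05.
```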