Progress toward openness, transparency, and reproducibility in cognitive neuroscience
Rick O. Gilmore (corresponding author)
Department of Psychology, the Pennsylvania State University, University Park, Pennsylvania

Michele T. Diaz
Department of Psychology, the Pennsylvania State University, University Park, Pennsylvania
Social, Life, & Engineering Sciences Imaging Center, the Pennsylvania State University, University Park, Pennsylvania

Brad A. Wyble
Department of Psychology, the Pennsylvania State University, University Park, Pennsylvania

Address for correspondence: Rick O. Gilmore, Ph.D., Associate Professor, Department of Psychology, the Pennsylvania State University, University Park, PA 16802. [email protected]

Abstract
Accumulating evidence suggests that many findings in psychological science and cognitive neuroscience may prove difficult to reproduce; statistical power in brain imaging studies is low and has not improved recently; software errors in analysis tools are common and can go undetected for many years; and, a few large-scale studies notwithstanding, open sharing of data, code, and materials remains the rare exception. At the same time, there is a renewed focus on reproducibility, transparency, and openness as essential core values in cognitive neuroscience. The emergence and rapid growth of data archives, meta-analytic tools, software pipelines, and research groups devoted to improved methodology reflect this new sensibility. We review evidence that the field has begun to embrace new open research practices and illustrate how these practices can begin to address problems of reproducibility, statistical power, and transparency in ways that will ultimately accelerate discovery.