The COVID-19 outbreak has led to an exponential increase in publications and preprints about the virus, its causes, consequences, and possible cures. COVID-19 research has been conducted under high time pressure and has been subject to financial and societal interests. Doing research under such pressure may influence the scrutiny with which researchers perform and write up their studies: researchers may become more diligent because of the high-stakes nature of the research, or the time pressure may lead to cutting corners and lower-quality output.
In this study, we conducted a natural experiment to compare the prevalence of incorrectly reported statistics in a stratified random sample of COVID-19 preprints and a matched sample of non-COVID-19 preprints.
Our results show that the overall prevalence of incorrectly reported statistics is 9-10%, but both frequentist and Bayesian hypothesis tests show no difference in the number of statistical inconsistencies between COVID-19 and non-COVID-19 preprints.
Taken together with previous research, our results suggest that the danger of hastily conducting and writing up research lies primarily in the risk of conducting methodologically inferior studies, and perhaps not in the quality of statistical reporting.
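As a rough illustration of what such a comparison involves, here is a minimal Python sketch that tests the difference between two inconsistency rates with a frequentist two-proportion z-test and a simple Bayesian posterior comparison. The counts are hypothetical, and this is not the preregistered analysis from the paper.

```python
# A hedged sketch, not the paper's preregistered analysis: compare two
# hypothetical inconsistency rates with a frequentist two-proportion z-test
# and a simple Bayesian posterior comparison (Beta(1, 1) priors).
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts: inconsistent results / results checked per sample
covid_incons, covid_n = 30, 300        # ~10%
control_incons, control_n = 27, 300    # ~9%

# Frequentist: two-proportion z-test
z, p = proportions_ztest(count=[covid_incons, control_incons],
                         nobs=[covid_n, control_n])
print(f"z = {z:.2f}, p = {p:.3f}")

# Bayesian: sample from the Beta posteriors of both rates and estimate
# the posterior probability that the COVID-19 rate is higher
rng = np.random.default_rng(seed=1)
post_covid = rng.beta(1 + covid_incons, 1 + covid_n - covid_incons, 100_000)
post_control = rng.beta(1 + control_incons, 1 + control_n - control_incons, 100_000)
print(f"P(COVID-19 rate > control rate | data) = {(post_covid > post_control).mean():.2f}")
```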
We investigated whether statistical reporting inconsistencies could be avoided if journals implement the tool statcheck in the peer review process.
In a preregistered study covering over 7,000 articles, we compared the inconsistency rates between two journals that implemented statcheck in their peer review process (Psychological Science and Journal of Experimental Social Psychology) with two matched control journals (Journal of Experimental Psychology: General and Journal of Personality and Social Psychology, respectively), before and after statcheck was implemented.
Preregistered multilevel logistic regression analyses showed that the decrease in both inconsistencies and decision inconsistencies around p = .05 was considerably steeper in the statcheck journals than in the control journals, supporting the notion that statcheck can be a useful tool for journals to avoid statistical reporting inconsistencies in published articles.
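For readers curious about the design, the sketch below shows the difference-in-differences logic behind this comparison, using a single-level logistic regression on simulated data. The actual preregistered analyses were multilevel (results nested within articles), and none of the numbers below are ours.

```python
# A hedged sketch of the difference-in-differences logic on simulated data,
# using a single-level logistic regression; the preregistered analyses were
# multilevel (results nested within articles) and far more detailed.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(seed=42)
n = 4000
df = pd.DataFrame({
    "statcheck_journal": rng.integers(0, 2, n),  # 1 = journal adopted statcheck
    "post": rng.integers(0, 2, n),               # 1 = after implementation
})
# Simulate a steeper drop in inconsistencies in statcheck journals after adoption
log_odds = -2.2 - 1.0 * df["statcheck_journal"] * df["post"]
df["inconsistent"] = rng.binomial(1, 1 / (1 + np.exp(-log_odds)))

# The interaction term is the difference-in-differences estimate: did
# inconsistency rates drop more in statcheck journals than in control journals?
model = smf.logit("inconsistent ~ statcheck_journal * post", data=df).fit()
print(model.summary())
```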
In December 2020, Willem Sleegers and I were awarded the Young eScientist Award from the Netherlands eScience Center for our proposal to improve statcheck’s searching algorithm. Today marks the start of our collaboration with the eScience Center and we are very excited to get started!
In this project, we plan to extend statcheck’s search algorithm with natural language processing algorithms, in order to recognize more statistics than just the ones reported perfectly in APA style (a current restriction). We hope that this extension will expand statcheck’s functionality beyond psychology, so that statistical errors in, e.g., biomedical and economics papers can also be detected and corrected.
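To illustrate the current restriction: statcheck (an R package) essentially matches results that are reported exactly in APA style and recomputes the p-value from the reported test statistic and degrees of freedom. The toy Python sketch below mimics that idea for t-tests; the real regular expressions and consistency rules are considerably more elaborate.

```python
# A toy mimic of statcheck's APA-style matching for t-tests; statcheck itself
# is an R package with far more elaborate patterns and consistency rules.
import re
from scipy import stats

APA_T = re.compile(r"t\((\d+)\)\s*=\s*(-?\d+\.?\d*),\s*p\s*[=<>]\s*(\.\d+)")

text = "The effect was significant, t(28) = 2.20, p = .04, as predicted."

for df_, t_val, p_reported in APA_T.findall(text):
    # Recompute the two-tailed p-value from the test statistic and df
    p_recomputed = 2 * stats.t.sf(abs(float(t_val)), int(df_))
    decimals = len(p_reported) - 1  # reported string looks like ".04"
    consistent = round(p_recomputed, decimals) == float(p_reported)
    print(f"reported p = {p_reported}, recomputed p = {p_recomputed:.3f}, "
          f"consistent: {consistent}")
```

A result reported in any other notation than this exact APA pattern would simply be missed, which is why the project replaces rigid patterns with NLP-based extraction.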
More information about the award can be found here.
Recently, academic organisations in the Netherlands have been discussing how we can improve the system of Recognition and Rewards for scientists. In a short interview for Tilburg University, I explain my hope that rewarding Open Science can benefit both science and scientists.
I’m thrilled to announce that I won a €250,000 NWO Veni Grant for my 4-Step Robustness Check! For the next 3 years, I’ll be working on methods to assess and improve the robustness of psychological science.
To check the robustness of a study, you could replicate it in a new sample. However, in my 4-Step Robustness Check, you first verify whether the reported numbers in the original study are correct. If they’re not, they are not interpretable, and you can’t compare them to the results of your replication.
Specifically, I advise researchers to do the following:
Check if there are visible errors in the reported numbers, for example by running a paper through my spellchecker for statistics: statcheck
Reanalyze the data following the original strategy to see if this leads to the same numbers
Check if the result is robust to alternative analytical choices
Perform a replication study in a new sample
This 4-step check provides an efficient framework to assess whether a study’s findings are robust. Note that the first steps take far less time than a full replication and might already be enough to conclude that a result is not robust.
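As a rough illustration, the sketch below walks through steps 2 and 3 for a hypothetical study whose original analysis was an independent-samples t-test. The data and the alternative analytical choice (excluding observations beyond 3 SD) are made-up stand-ins.

```python
# A hedged sketch of steps 2 and 3, assuming the original analysis was an
# independent-samples t-test; the data and the alternative analytical choice
# (excluding observations beyond 3 SD) are hypothetical stand-ins.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=7)
group_a = rng.normal(0.0, 1.0, 50)
group_b = rng.normal(0.5, 1.0, 50)

# Step 2: reanalyze the data following the original strategy
t_orig, p_orig = stats.ttest_ind(group_a, group_b)
print(f"original strategy:  t = {t_orig:.2f}, p = {p_orig:.3f}")

# Step 3: rerun the same test under an alternative analytical choice
def drop_outliers(x, sd=3.0):
    """Exclude observations more than `sd` standard deviations from the mean."""
    return x[np.abs(x - x.mean()) <= sd * x.std()]

t_alt, p_alt = stats.ttest_ind(drop_outliers(group_a), drop_outliers(group_b))
print(f"alternative choice: t = {t_alt:.2f}, p = {p_alt:.3f}")
# A robust result leads to the same substantive conclusion in both runs.
```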
The proposed framework can also be used as an efficient checklist for researchers to improve robustness of their own results:
Check the internal consistency of your reported results
Share your data and analysis scripts to facilitate reanalysis
Conduct and report your own sensitivity analyses
Write detailed methods sections and share materials to facilitate replication
Ultimately, I aim to create interactive, pragmatic, and evidence-based methods to improve and assess robustness, applicable to psychology and other fields.
I would like to wholeheartedly thank my colleagues, reviewers, and committee members for their time, feedback, and valuable insights. I’m looking forward to the next three years!
I am happy to announce that Robbie van Aert, Jelte Wicherts, and I received seed funding from the Herbert Simon Research Institute for our project to screen COVID-19 preprints for statistical inconsistencies.
Inconsistencies can distort conclusions, and even small ones harm the reproducibility of a paper (i.e., it becomes unclear where a number came from). Statistical reproducibility is a basic requirement for any scientific paper.
We plan to check a random sample of COVID-19 preprints from medRxiv and bioRxiv for several types of statistical inconsistencies. For example: does a percentage match the accompanying fraction? Do the TP/TN/FP/FN rates match the reported sensitivity of a test?
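As a minimal sketch of what such checks look like in code (our actual coding protocol is more detailed), the snippet below recomputes a reported value from its components and allows for rounding; all numbers are hypothetical.

```python
# A hedged sketch of two planned checks with hypothetical numbers; the actual
# coding protocol is more detailed. Core idea: recompute a reported value
# from its components and allow for rounding.
def consistent(reported, recomputed, decimals):
    """Does the recomputed value equal the reported one after rounding?"""
    return abs(round(recomputed, decimals) - reported) < 10 ** -(decimals + 1)

# Check 1: does a percentage match the accompanying fraction?
# e.g., "23 of 156 patients (14.7%)"
print(consistent(reported=14.7, recomputed=100 * 23 / 156, decimals=1))   # True

# Check 2: do TP/FN counts match the reported sensitivity of a test?
# e.g., "sensitivity 0.83 (40 true positives, 8 false negatives)"
tp, fn = 40, 8
print(consistent(reported=0.83, recomputed=tp / (tp + fn), decimals=2))   # True
```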
We have 3 main objectives:
Post short reports with detected statistical inconsistencies underneath the preprint
Assess the prevalence of statistical inconsistencies in COVID-19 preprints
Compare the inconsistency rate in COVID-19 preprints with the inconsistency rate in similar preprints on other topics
We hypothesize that high time pressure may have led to a higher prevalence of statistical inconsistencies in COVID-19 preprints than in preprints on less time-sensitive topics.
We thank our colleagues at the Meta-Research Center for their feedback and help in developing the coding protocol.
I am happy to announce that our paper “Reproducibility of individual effect sizes in meta-analyses in psychology” was published in PLoS One (first-authored by Esther Maassen). In this study, we assessed 500 primary effect sizes from 33 psychology meta-analyses. Reproducibility was problematic in 45% of the cases (see the figure below for the different causes). We strongly recommend that meta-analysts share their data and code.