We investigated whether statistical reporting inconsistencies could be avoided if journals implement the tool statcheck in the peer review process.
In a preregistered study covering over 7000 articles, we compared the inconsistency rates between two journals that implemented statcheck in their peer review process (Psychological Science and Journal of Experimental Social Psychology) with two matched control journals (Journal of Experimental Psychology: General and Journal of Personality and Social Psychology, respectively), before and after statcheck was implemented.
Preregistered multilevel logistic regression analyses showed that the decrease in both inconsistencies and decision inconsistencies around p = .05 was considerably steeper in statcheck journals than in control journals, offering support for the notion that statcheck can be a useful tool for journals to avoid statistical reporting inconsistencies in published articles.
In December 2020, Willem Sleegers and I were awarded the Young eScientist Award from the Netherlands eScience Center for our proposal to improve statcheck’s searching algorithm. Today marks the start of our collaboration with the eScience Center and we are very excited to get started!
In this project, we plan to extend statcheck’s search algorithm with natural language processing algorithms, in order to recognize more statistics than just the ones reported perfectly in APA style (a current restriction). We hope that this extension will expand statcheck’s functionality beyond psychology, so that statistical errors in, e.g., biomedical and economics papers can also be detected and corrected.
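To illustrate why relying on APA formatting is a restriction, here is a minimal sketch of strict pattern matching for t-tests. The regular expression and the function name extract_apa_t_tests are assumptions made for this example. They are not statcheck's actual search patterns, and the example sentences are made up.

```python
import re

# Hypothetical pattern for results written exactly like "t(28) = 2.20, p = .03".
# This is an illustration only, not the pattern statcheck itself uses.
APA_T_TEST = re.compile(
    r"\bt\((?P<df>\d+(?:\.\d+)?)\)\s*=\s*(?P<value>-?\d*\.?\d+),\s*"
    r"p\s*(?P<comparison>[<>=])\s*(?P<p>\d?\.\d+)"
)


def extract_apa_t_tests(text):
    """Return every t-test in `text` that follows the strict pattern above."""
    results = []
    for match in APA_T_TEST.finditer(text):
        results.append({
            "df": float(match.group("df")),
            "value": float(match.group("value")),
            "comparison": match.group("comparison"),
            "p": float(match.group("p")),
        })
    return results


# Made-up sentences: the first follows APA style, the second does not.
apa_text = "The effect was significant, t(28) = 2.20, p = .03."
other_text = "The effect was significant: t = 2.20 (df = 28; P=0.03)."

found = extract_apa_t_tests(apa_text)     # one result is recognized
missed = extract_apa_t_tests(other_text)  # nothing is recognized
```

Both sentences report the same result, but only the first one is picked up. That gap is the kind of thing a more flexible search approach would need to close.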
More information about the award can be found here.
Recently, academic organisations in the Netherlands have been discussing how we can improve the system of Recognition and Rewards for scientists. In a short interview for Tilburg University, I explain my hope that rewarding Open Science can benefit both science and scientists.
I’m thrilled to announce that I won a €250,000 NWO Veni Grant for my 4-Step Robustness Check! For the next three years, I’ll be working on methods to assess and improve the robustness of psychological science.
To check the robustness of a study, you could replicate it in a new sample. However, in my 4-Step Robustness Check, you first verify whether the reported numbers in the original study are correct. If they’re not, they are not interpretable and you can’t compare them to the results of your replication.
Specifically, I advise researchers to do the following:
Check if there are visible errors in the reported numbers, for example by running a paper through my spellchecker for statistics: statcheck
Reanalyze the data following the original strategy to see if this leads to the same numbers
Check if the result is robust to alternative analytical choices
Perform a replication study in a new sample
This 4-step check provides an efficient framework to check if a study’s findings are robust. Note that the first steps take way less time than a full replication and might be enough to conclude a result is not robust.
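As a rough illustration of what the first step involves, the sketch below recomputes a p-value from a reported test statistic and compares it with the reported p-value, allowing for rounding. It assumes a two-sided z-test so that it runs on the Python standard library alone. The function names and the example numbers are made up for this sketch, and it is not statcheck's implementation.

```python
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def check_z_result(z, reported_p, z_decimals=2, p_decimals=2, alpha=0.05):
    """Compare a reported p-value with the p-value implied by a reported z.

    Hypothetical helper for illustration only. The reported z is itself
    rounded, so every true value within half a unit of its last decimal
    is treated as compatible with the report.
    """
    half_unit = 0.5 * 10 ** (-z_decimals)
    # A larger |z| implies a smaller p, so the bounds swap.
    p_upper = two_sided_p(max(abs(z) - half_unit, 0.0))
    p_lower = two_sided_p(abs(z) + half_unit)
    computed_p = two_sided_p(z)

    # Consistent: the reported p falls inside the rounded range of possible p-values.
    consistent = round(p_lower, p_decimals) <= reported_p <= round(p_upper, p_decimals)
    # Decision inconsistency: inconsistent AND the significance verdict flips.
    decision_inconsistent = (not consistent) and (
        (reported_p < alpha) != (computed_p < alpha)
    )
    return {
        "computed_p": computed_p,
        "consistent": consistent,
        "decision_inconsistent": decision_inconsistent,
    }


# Made-up reported results, not taken from any paper.
ok = check_z_result(z=2.20, reported_p=0.03)
same_side = check_z_result(z=2.20, reported_p=0.04)
flipped = check_z_result(z=1.80, reported_p=0.04)
```

In this sketch, a result only counts as a decision inconsistency when the reported and recomputed p-values fall on different sides of the significance threshold.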
The proposed framework can also be used as an efficient checklist for researchers to improve robustness of their own results:
Check the internal consistency of your reported results
Share your data and analysis scripts to facilitate reanalysis
Conduct and report your own sensitivity analyses
Write detailed methods sections and share materials to facilitate replication
Ultimately, I aim to create interactive, pragmatic, and evidence-based methods to improve and assess robustness, applicable to psychology and other fields.
I would like to wholeheartedly thank my colleagues, reviewers, and committee members for their time, feedback, and valuable insights. I’m looking forward to the next three years!
I am happy to announce that Robbie van Aert, Jelte Wicherts, and I received seed funding from the Herbert Simon Research Institute for our project to screen COVID-19 preprints for statistical inconsistencies.
Inconsistencies can distort conclusions, but even if inconsistencies are small, they negatively affect the reproducibility of a paper (i.e., where did a number come from?). Statistical reproducibility is a basic requirement for any scientific paper.
We plan to check a random sample of COVID-19 preprints from medRxiv and bioRxiv for several types of statistical inconsistencies. E.g., does a percentage match the accompanying fraction? Do the TP/TN/FP/FN rates match the reported sensitivity of a test?
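The two example checks can be sketched in a few lines. This is a minimal illustration, assuming that a reported percentage counts as consistent when it lies within half a unit of its last reported decimal from the exact value. The function names and all counts are hypothetical and are not taken from our coding protocol or from any preprint.

```python
def percentage_matches_fraction(numerator, denominator, reported_pct, decimals=1):
    """True if reported_pct agrees with numerator/denominator up to rounding."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    exact_pct = 100.0 * numerator / denominator
    # Half a unit of the last reported decimal, plus a tiny floating-point margin.
    tolerance = 0.5 * 10 ** (-decimals) + 1e-9
    return abs(exact_pct - reported_pct) <= tolerance


def sensitivity_matches_counts(tp, fn, reported_pct, decimals=1):
    """Sensitivity is TP / (TP + FN); check it against the reported percentage."""
    return percentage_matches_fraction(tp, tp + fn, reported_pct, decimals)


# Made-up examples: "45/120 (37.5%)" is consistent, "45/120 (35.7%)" is not.
fraction_ok = percentage_matches_fraction(45, 120, 37.5)
fraction_bad = percentage_matches_fraction(45, 120, 35.7)

# Made-up diagnostic test: 90 true positives and 10 false negatives.
sensitivity_ok = sensitivity_matches_counts(tp=90, fn=10, reported_pct=90.0)
sensitivity_bad = sensitivity_matches_counts(tp=90, fn=10, reported_pct=95.0)
```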
We have 3 main objectives:
Post short reports with detected statistical inconsistencies underneath the preprint
Assess the prevalence of statistical inconsistencies in COVID-19 preprints
Compare the inconsistency rate in COVID-19 preprints with the inconsistency rate in similar preprints on other topics
We hypothesize that high time pressure may have led to a higher prevalence of statistical inconsistencies in COVID-19 preprints than in preprints on less time-sensitive issues.
We thank our colleagues at the Meta-Research Center for their feedback and help in developing the coding protocol.
I am happy to announce that our paper “Reproducibility of individual effect sizes in meta-analyses in psychology” was published in PLoS One (first-authored by Esther Maassen). In this study, we assessed 500 primary effect sizes from 33 psychology meta-analyses. Reproducibility was problematic in 45% of the cases (see Figure below for different causes). We strongly recommend meta-analysts to share their data and code.
I am very happy to announce that my paper “Practical tools and strategies for researchers to increase replicability” was listed as a Top Download for the journal Developmental Medicine & Child Neurology.
The paper gives an overview of concrete actions researchers can undertake to improve the openness, replicability, and overall robustness of their work.
I hope that the high number of downloads indicates that many researchers were able to cherry-pick open practices that worked for their situation.
On February 21st, 2020, I gave an online talk for the Webcast Series on Transparency from Project TIER on how to efficiently assess and improve the robustness of scientific findings in four steps. The full talk can be found below. More details are posted on the Project TIER website. Also check out the other talks in this series here.
Last month, the QUEST center in Berlin organized the first METAxDATA meeting on building automated screening tools for data-driven meta-research. On the first night of the meeting, 13 researchers gave lightning talks about their tools. The clip below features my <2 minute lightning talk about statcheck.
All lightning talks were recorded and can be found here.