This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.
The SORTEE Guidelines for Data and Code Quality Control in Ecology and Evolutionary Biology
Authors
Abstract
Open data and code are crucial to increasing transparency and reproducibility, and to building trust in scientific research. However, despite an increasing number of journals in ecology and evolutionary biology mandating that data and code be archived alongside published articles, the amount and quality of archived data and code, and the subsequent reproducibility of results, have remained worryingly low. As a result, a handful of journals have recruited dedicated data editors, whose role is to help authors increase the overall quality of archived data and code. There is, however, a general lack of consensus on what a data editor should check, how to do it, and to what level of detail, and the process is often vague and hidden from readers and authors alike. Here, with input from multiple data editors across several journals in ecology and evolutionary biology, we establish and describe the first standardised guidelines for Data and Code Quality Control on behalf of the Society for Open, Reliable, and Transparent Ecology and Evolutionary Biology (SORTEE). We start by introducing the concept of a data editor and of data and code quality control, what is expected from data and code quality control, and the relative costs and benefits to journals, authors, and readers. We then introduce and detail the SORTEE-led guidelines, ending with advice for journals and authors. We believe that by adopting these standardised guidelines, journals will help increase the consistency and transparency of the data editor process for readers, authors, and data editors.
DOI
https://doi.org/10.32942/X24P8S
Subjects
Ecology and Evolutionary Biology
Keywords
data sharing, code sharing, computational reproducibility, open science, Data Re-use, Methodological Rigor, FAIR principles, transparency, Data editor
Dates
Published: 2025-08-15 01:36
Last Updated: 2025-08-15 01:36
License
CC BY Attribution 4.0 International
Additional Metadata
Conflict of interest statement:
SORTEE has been financially supported by Dryad, Figshare, the Center for Open Science (which hosts the Open Science Framework; OSF), Peer Community In, the American Society of Naturalists, and the Royal Society, all of which are mentioned in the guidelines. EIC is the 2025 president of SORTEE. EIC, ML, MP, and AST are on the SORTEE board of directors. JLP, KBN, CJ, SN, and EIC are members of the SORTEE advocacy committee. JLP, BJA, KBN, JAB, BC, DG, CJ, RK, ML, SN, ROD, MP, QP, AST, NvD, and EIC are SORTEE members. BB is a data editor at the American Naturalist. EIC, AST, ROD, NvD, MJG, TD, EF, PDA, and QP are data editors at Ecology Letters. BJA, JAB, DG, DSM, and LW are data editors at Proceedings B. SL is the data editor at Journal of Evolutionary Biology. EFJ is the data editor at Behavioural Ecology and Sociobiology. BC, RK, ML, and MP are data editors at PCI.
Data and Code Availability Statement:
Not applicable
Language:
English
Comment #242 Christelle Dantec @ 2025-10-04 20:06
We can only welcome this type of guideline, which aims to promote best practices and increasingly emphasize open data and reproducibility.
However, the proposal to open up data and code by depositing them in a repository whose business model is based on Data Publishing Charges (Dryad) raises questions.
Zenodo, for its part, has only a limited number of mandatory metadata fields, which can hinder the reproducibility and reusability of datasets.
By contrast, recommending repositories with extensive metadata and curation would ensure better reproducibility, greater trust, and a better understanding of the data.
For code, could Software Heritage also be mentioned?