Guidance framework to apply best practices in ecological data analysis: Lessons learned from building Galaxy-Ecology

This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: https://doi.org/10.24072/pci.ecology.100694. This is version 5 of this Preprint.

Authors

Coline Royaux, Jean-Baptiste Mihoub, Marie Jossé, Dominique Pelletier, Olivier Norvez, Yves Reecht, Anne Fouilloux, Helena Rasche, Saskia Hiltemann, Bérénice Batut, Marc Eléaume, Pauline Seguineau, Guillaume Massé, Alan Amossé, Claire Bissery, Romain Lorrilliere, Alexis Martin, Yves Bas, Thimothée Virgoulay, Valentin Chambon, Elie Arnaud, Elisa Michon, Clara Urfer, Eloïse Trigodet, Marie Delannoy, Gregoire Loïs, Romain Julliard, Björn Grüning, Yvan Le Bras

Abstract

Numerous conceptual frameworks exist for best practices in research data and analysis (e.g. Open Science and FAIR principles). In practice, there is a need for further progress to improve transparency, reproducibility, and confidence in ecology. Here, we propose a practical and operational framework for researchers and experts in ecology to achieve best practices for building analytical procedures from individual research projects to production-level analytical pipelines. We introduce the concept of atomisation to identify analytical steps which support generalisation by allowing us to go beyond single analyses. The term atomisation is employed to convey the idea of single analytical steps as “atoms” composing an analytical procedure. When generalised, “atoms” can be used in more than a single case analysis. These guidelines were established during the development of the Galaxy-Ecology initiative, a web platform dedicated to data analysis in ecology. Galaxy-Ecology allows us to demonstrate a way to reach higher levels of reproducibility in ecological sciences by increasing the accessibility and reusability of analytical workflows once atomised and generalised.

DOI

https://doi.org/10.32942/X2G033

Subjects

Bioinformatics, Other Ecology and Evolutionary Biology

Keywords

biodiversity, Reproducible analyses, Galaxy, Good practices, Atomisation, Generalisation, workflows, ecoinformatics, Conda, container, Common Workflow Language, RO-CRATE

Dates

Published: 2024-04-11 14:35

Last Updated: 2024-10-09 00:47

License

CC BY Attribution 4.0 International

Additional Metadata

Language:
English

Data and Code Availability Statement:
Not applicable