Preprints
Filtering by Subject: Bioinformatics
AI and Big Data for invasion biology: finding, modelling and forecasting the population dynamics of invaders
Published: 2025-01-07
Subjects: Biodiversity, Bioinformatics, Life Sciences, Population Biology
Artificial intelligence (AI) is rapidly transforming the study and management of invasive species through analytical and predictive tools that optimize detection, monitoring, and automated eradication. In this work, we reviewed the fundamental principles of machine learning and deep learning, illustrated with recent case studies on invasive species. We also present the first systematic review of [...]
A curated benchmark dataset for molecular identification based on genome skimming
Published: 2024-12-19
Subjects: Biodiversity, Bioinformatics, Ecology and Evolutionary Biology, Genetics and Genomics, Life Sciences
Genome skimming is a promising sequencing strategy for DNA-based taxonomic identification. However, the lack of standardized datasets for benchmarking genome skimming tools presents a challenge in comparing new methods to existing ones. As part of the development of varKoder, a new tool for DNA-based identification, we curated four datasets designed for comparing molecular identification tools [...]
Gut microbiome composition and function – including transposase gene abundance – varies with age, but not senescence, in a wild vertebrate
Published: 2024-11-14
Subjects: Bioinformatics, Ecology and Evolutionary Biology, Life Sciences, Microbiology, Ornithology
Studies on wild animals, mostly undertaken using 16S metabarcoding, have yielded ambiguous evidence regarding changes in the gut microbiome (GM) with age and senescence. Furthermore, variation in GM function has rarely been studied in such wild populations, despite GM metabolic characteristics potentially being associated with host senescent declines. Here, we used seven years of longitudinal [...]
FAIRification of DMRichR Pipeline: Advancing Epigenetic Research on Environmental and Evolutionary Model Organisms
Published: 2024-11-05
Subjects: Bioinformatics, Computational Biology, Environmental Public Health, Life Sciences, Other Ecology and Evolutionary Biology
Bioinformatics tools often prioritize humans or human-related model organisms, overlooking the requirements of environmentally relevant species, which limits their use in ecological research. This gap is particularly challenging when implementing existing software, as inadequate documentation can delay the innovative use of environmental models for modern risk assessment of chemicals that can [...]
Navigating phylogenetic conflict and evolutionary inference in plants with target capture data
Published: 2024-05-27
Subjects: Bioinformatics, Biology, Botany, Ecology and Evolutionary Biology, Evolution, Life Sciences, Plant Sciences, Research Methods in Life Sciences
Target capture has quickly become a preferred approach for plant systematic and evolutionary research, marking a step-change in the generation of data for phylogenetic inference. While this advancement has facilitated the resolution of many relationships, phylogenetic conflict continues to be reported, and often attributed to genome duplication, reticulation, incomplete lineage sorting or rapid [...]
A minimum data standard for wildlife disease research and surveillance
Published: 2024-05-19
Subjects: Animal Diseases, Biodiversity, Bioinformatics, Diseases, Ecology and Evolutionary Biology, Environmental Microbiology and Microbial Ecology Life Sciences, Life Sciences, Microbiology, Parasitic Diseases, Veterinary Infectious Diseases, Veterinary Medicine, Virology, Virus Diseases
Rapid and comprehensive data sharing is vital to the transparency and actionability of wildlife infectious disease research and surveillance. Unfortunately, most best practices for publicly sharing these data are focused on pathogen determination and genetic sequence data. Other facets of wildlife disease data – particularly negative results – are often withheld or, at best, summarized in a [...]
Guidance framework to apply best practices in ecological data analysis: Lessons learned from building Galaxy-Ecology
Published: 2024-04-11
Subjects: Bioinformatics, Other Ecology and Evolutionary Biology
Numerous conceptual frameworks exist for best practices in research data and analysis (e.g. Open Science and FAIR principles). In practice, there is a need for further progress to improve transparency, reproducibility, and confidence in ecology. Here, we propose a practical and operational framework for researchers and experts in ecology to achieve best practices for building analytical [...]
Incomplete lineage sorting and hybridization underlie tree discordance in Petunia and related genera (Petunieae, Solanaceae)
Published: 2024-03-29
Subjects: Biodiversity, Bioinformatics, Botany, Genomics
Despite the overarching history of species divergence, phylogenetic studies often reveal distinct topologies across regions of the genome. The sources of these gene tree discordances are variable, but incomplete lineage sorting (ILS) and hybridization are among those with the most biological importance. Petunia serves as a classic system for studying hybridization in the wild. While field studies [...]
Ten Simple Rules to build a Model Life Cycle
Published: 2024-02-09
Subjects: Bioinformatics, Ecology and Evolutionary Biology, Software Engineering
Just like data, models have their own life cycle. By recognizing how one’s model fits within the life cycle of the data (or at least, ensuring that the model life cycle is understood), we can identify opportunities to foster new collaborations, encourage better practices in data analysis, and ultimately accelerate research. In this manuscript, we introduce the Model Life Cycle and develop a [...]
A composite universal DNA signature for the Tree of Life
Published: 2024-01-18
Subjects: Bioinformatics, Computational Biology, Genomics, Other Ecology and Evolutionary Biology
Species identification using DNA barcodes has revolutionized biodiversity sciences. However, conventional barcoding methods may lack power and universal applicability across the Tree of Life. Alternative methods based on whole genome sequencing are hard to scale due to large data requirements. Here, we develop a novel DNA-based identification method, varKoding, using exceptionally low-coverage genome [...]
A big data and machine learning approach for monitoring the condition of ecosystems
Published: 2024-01-16
Subjects: Applied Statistics, Biodiversity, Bioinformatics, Earth Sciences, Ecology and Evolutionary Biology, Environmental Sciences, Forest Biology, Forest Sciences, Life Sciences, Other Ecology and Evolutionary Biology, Physical Sciences and Mathematics, Statistical Methodology, Statistical Models, Terrestrial and Aquatic Ecology
Ecosystems are highly valuable as a source of goods and services and as a heritage for future generations. Knowing their condition is extremely important for all management and conservation activities and public policies. Until now, the evaluation of ecosystem condition has been unsatisfactory and thus lacks practical implementation for most countries. We propose that ecosystem integrity is a [...]
otb: Creating a HiC/HiFi Pipeline to Assemble the Prosapia bicincta Genome
Published: 2023-12-05
Subjects: Agriculture, Bioinformatics, Computational Biology, Genomics, Other Animal Sciences
The implementation of a new genomic assembly pipeline named only the best [Genome Assembly Tools] (otb) has effectively addressed various challenges associated with data management during the development and storage of genome assemblies. otb incorporates a comprehensive pipeline involving a setup layer, quality checks, templating, and the integration of Nextflow and Singularity. The [...]
Towards causal relationships for modelling species distribution
Published: 2023-10-14
Subjects: Biodiversity, Bioinformatics, Life Sciences, Natural Resources and Conservation, Statistical Models
1. Understanding the processes underlying the distribution of species through space and time is fundamental in several research fields spanning from ecology to spatial epidemiology. Correlative species distribution models (SDMs) involve popular statistical tools to infer species geographical distribution thanks to spatiotemporally explicit observations of species occurrences coupled with a set of [...]
Best practices for genetic and genomic data archiving
Published: 2023-09-25
Subjects: Bioinformatics, Biology, Ecology and Evolutionary Biology, Genetics and Genomics, Life Sciences
Genetic and genomic data are collected for a vast array of scientific and applied purposes. Despite mandates for public archiving, data are typically used only by the generating authors. The reuse of genetic and genomic datasets remains uncommon because it is difficult, if not impossible, due to non-standard archiving practices and lack of contextual metadata. But as the new field of [...]
CasPEDIA Database: A Functional Classification System for Class 2 CRISPR-Cas Enzymes
Published: 2023-08-17
Subjects: Biochemistry, Biophysics, and Structural Biology, Bioinformatics, Life Sciences
CRISPR-Cas enzymes enable RNA-guided bacterial immunity and are widely used for biotechnological applications including genome editing. In particular, the Class 2 CRISPR-associated enzymes (Cas9, Cas12 and Cas13 families), have been deployed for numerous research, clinical and agricultural applications. However, the immense genetic and biochemical diversity of these proteins in the public domain [...]