Preprints
Filtering by Subject: Bioinformatics
A dataset for benchmarking molecular identification tools based on genome skimming
Published: 2024-12-19
Subjects: Biodiversity, Bioinformatics, Ecology and Evolutionary Biology, Genetics and Genomics, Life Sciences
Genome skimming is an emerging tool allowing for scalable DNA barcoding efforts for numerous biodiversity science applications. Despite its growing importance, there are few standardized datasets for benchmarking genome skimming tools, making it challenging to evaluate new methods (e.g., using machine learning) and compare them to existing ones (e.g., conventional barcoding loci derived from [...]
Gut microbiome composition and function – including transposase gene abundance – varies with age, but not senescence, in a wild vertebrate
Published: 2024-11-14
Subjects: Bioinformatics, Ecology and Evolutionary Biology, Life Sciences, Microbiology, Ornithology
Studies on wild animals, mostly undertaken using 16S metabarcoding, have yielded ambiguous evidence regarding changes in the gut microbiome (GM) with age and senescence. Furthermore, variation in GM function has rarely been studied in such wild populations, despite GM metabolic characteristics potentially being associated with host senescent declines. Here, we used seven years of longitudinal [...]
FAIRification of DMRichR Pipeline: Advancing Epigenetic Research on Environmental and Evolutionary Model Organisms
Published: 2024-11-05
Subjects: Bioinformatics, Computational Biology, Environmental Public Health, Life Sciences, Other Ecology and Evolutionary Biology
Bioinformatics tools often prioritize humans or human-related model organisms, overlooking the requirements of environmentally relevant species, which limits their use in ecological research. This gap is particularly challenging when implementing existing software, as inadequate documentation can delay the innovative use of environmental models for modern risk assessment of chemicals that can [...]
Navigating phylogenetic conflict and evolutionary inference in plants with target capture data
Published: 2024-05-27
Subjects: Bioinformatics, Biology, Botany, Ecology and Evolutionary Biology, Evolution, Life Sciences, Plant Sciences, Research Methods in Life Sciences
Target capture has quickly become a preferred approach for plant systematic and evolutionary research, marking a step-change in the generation of data for phylogenetic inference. While this advancement has facilitated the resolution of many phylogenetic relationships, phylogenetic conflict continues to be reported, and often attributed to genome duplication, reticulation, deep coalescence or [...]
A minimum data standard for wildlife disease studies
Published: 2024-05-19
Subjects: Animal Diseases, Biodiversity, Bioinformatics, Diseases, Ecology and Evolutionary Biology, Environmental Microbiology and Microbial Ecology Life Sciences, Life Sciences, Microbiology, Parasitic Diseases, Veterinary Infectious Diseases, Veterinary Medicine, Virology, Virus Diseases
Thousands of scientists and practitioners conduct research on infectious diseases of wildlife. Rapid and comprehensive data sharing is vital to the transparency and actionability of their work, but unfortunately, most efforts designed to publicly share these data are focused on pathogen determination and genetic sequence data. Other facets of existing surveillance data – particularly [...]
Guidance framework to apply best practices in ecological data analysis: Lessons learned from building Galaxy-Ecology
Published: 2024-04-11
Subjects: Bioinformatics, Other Ecology and Evolutionary Biology
Numerous conceptual frameworks exist for best practices in research data and analysis (e.g. Open Science and FAIR principles). In practice, there is a need for further progress to improve transparency, reproducibility, and confidence in ecology. Here, we propose a practical and operational framework for researchers and experts in ecology to achieve best practices for building analytical [...]
Incomplete lineage sorting and hybridization underlie tree discordance in Petunia and related genera (Petunieae, Solanaceae)
Published: 2024-03-29
Subjects: Biodiversity, Bioinformatics, Botany, Genomics
Despite the overarching history of species divergence, phylogenetic studies often reveal distinct topologies across regions of the genome. The sources of these gene tree discordances are variable, but incomplete lineage sorting (ILS) and hybridization are among those with the most biological importance. Petunia serves as a classic system for studying hybridization in the wild. While field studies [...]
Ten Simple Rules to build a Model Life Cycle
Published: 2024-02-09
Subjects: Bioinformatics, Ecology and Evolutionary Biology, Software Engineering
Just like data, models have their own life cycle. By recognizing how one’s model fits within the life cycle of the data (or at least, ensuring that the model life cycle is understood), we can identify opportunities to foster new collaborations, encourage better practices in data analysis, and ultimately accelerate research. In this manuscript, we introduce the Model Life Cycle and develop a [...]
A universal DNA signature for the Tree of Life
Published: 2024-01-18
Subjects: Bioinformatics, Computational Biology, Genomics, Other Ecology and Evolutionary Biology
Species identification using DNA barcodes has revolutionized biodiversity sciences and society at large. However, conventional barcoding methods may lack power and universal applicability across the Tree of Life. Alternative methods based on whole genome sequencing are hard to scale due to large data requirements. Here, we develop a novel DNA-based identification method, varKoding, using [...]
A big data and machine learning approach for monitoring the condition of ecosystems
Published: 2024-01-16
Subjects: Applied Statistics, Biodiversity, Bioinformatics, Earth Sciences, Ecology and Evolutionary Biology, Environmental Sciences, Forest Biology, Forest Sciences, Life Sciences, Other Ecology and Evolutionary Biology, Physical Sciences and Mathematics, Statistical Methodology, Statistical Models, Terrestrial and Aquatic Ecology
Ecosystems are highly valuable as a source of goods and services and as a heritage for future generations. Knowing their condition is extremely important for all management and conservation activities and public policies. Until now, the evaluation of ecosystem condition has been unsatisfactory and thus lacks practical implementation for most countries. We propose that ecosystem integrity is a [...]
otb: Creating a HiC/HiFi Pipeline to Assemble the Prosapia bicincta Genome
Published: 2023-12-05
Subjects: Agriculture, Bioinformatics, Computational Biology, Genomics, Other Animal Sciences
The implementation of a new genomic assembly pipeline named only the best [Genome Assembly Tools] (otb) has effectively addressed various challenges associated with data management during the development and storage of genome assemblies. otb incorporates a comprehensive pipeline involving a setup layer, quality checks, templating, and the integration of Nextflow and Singularity. The [...]
Towards causal relationships for modelling species distribution
Published: 2023-10-14
Subjects: Biodiversity, Bioinformatics, Life Sciences, Natural Resources and Conservation, Statistical Models
1. Understanding the processes underlying the distribution of species through space and time is fundamental in several research fields spanning from ecology to spatial epidemiology. Correlative species distribution models (SDMs) involve popular statistical tools to infer species geographical distribution thanks to spatiotemporally explicit observations of species occurrences coupled with a set of [...]
Best practices for genetic and genomic data archiving
Published: 2023-09-25
Subjects: Bioinformatics, Biology, Ecology and Evolutionary Biology, Genetics and Genomics, Life Sciences
Genetic and genomic data are collected for a vast array of scientific and applied purposes. Despite mandates for public archiving, data are typically used only by the generating authors. The reuse of genetic and genomic datasets remains uncommon because it is difficult, if not impossible, due to non-standard archiving practices and lack of contextual metadata. But as the new field of [...]
CasPEDIA Database: A Functional Classification System for Class 2 CRISPR-Cas Enzymes
Published: 2023-08-17
Subjects: Biochemistry, Biophysics, and Structural Biology, Bioinformatics, Life Sciences
CRISPR-Cas enzymes enable RNA-guided bacterial immunity and are widely used for biotechnological applications including genome editing. In particular, the Class 2 CRISPR-associated enzymes (Cas9, Cas12 and Cas13 families) have been deployed for numerous research, clinical and agricultural applications. However, the immense genetic and biochemical diversity of these proteins in the public domain [...]
STRyper: an open source macOS application for microsatellite genotyping and chromatogram management
Published: 2023-07-29
Subjects: Biodiversity, Bioinformatics, Molecular Genetics, Other Ecology and Evolutionary Biology
Microsatellite markers analyzed by capillary sequencing remain useful tools for rapid genotyping and low-cost studies. This contrasts with the lack of a free application to analyze chromatograms for microsatellite genotyping that is not restricted to human genotyping. To fill this gap, I have developed STRyper, a macOS application whose source code is published under the General Public License. [...]