This is a Preprint and has not been peer reviewed. This is version 2 of this Preprint.
Abstract
Species delimitation is the process of distinguishing between populations of the same species and distinct species within a particular group of organisms. Various methods exist for inferring species limits, whether based on morphological, molecular, or other types of data. Most methods based on DNA sequences are rooted in coalescent theory. However, coalescent-based models have limitations, especially regarding complex evolutionary scenarios, large datasets, and varying genetic data types. In this context, machine learning (ML) can be considered a promising analytical tool that provides an effective way to explore dataset structure when species-level divergences are hypothesized. In this review, we examine the use of ML in species delimitation and provide an overview and critical appraisal of existing workflows. We also provide simple explanations of how the main types of ML approaches operate, which should help uninitiated researchers and students interested in the field. Our review suggests that while current ML methods designed to infer species limits are analytically powerful, they also present specific limitations and should not be considered definitive alternatives to coalescent methods for species delimitation. On the other hand, such variability might also represent an advantage, highlighting the flexibility of ML algorithms. Future work should consider the constraints related to the use of simulated data, as in other model-based methods relying on simulations. We also propose best practices for the use of ML methods in species delimitation, offering insights into potential future applications. We expect that the proposed guidelines will be useful for enhancing the accessibility, effectiveness, and objectivity of ML in species delimitation.
DOI
https://doi.org/10.32942/X2W313
Subjects
Biology, Computational Biology, Ecology and Evolutionary Biology, Genetics and Genomics
Keywords
Bioinformatics, molecular data, speciation, phylogenetics, phylogenomics, artificial intelligence, deep learning
Dates
Published: 2023-12-07 12:20
Last Updated: 2024-10-09 03:47
License
CC BY Attribution 4.0 International
Additional Metadata
Language:
English
Data and Code Availability Statement:
Not applicable