This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: https://doi.org/10.3998/ptpbio.2101. This is version 4 of this Preprint.
Abstract
Evolutionary and organismal biology have become inundated with data. At the same time, we are experiencing a surge in broader evolutionary and ecological syntheses for which tree-thinking is the staple for a variety of post-tree analyses. To take full advantage of this wealth of data to discover and understand large-scale evolutionary and ecological patterns, computational data integration, i.e., the use of machines to link data at large scale, is crucial. The most common shared entity by which evolutionary and ecological data need to be linked is the taxon to which they belong. We propose a set of requirements that a system for defining such taxa should meet for computational data science: taxon definitions should maintain conceptual consistency, be reproducible via a known algorithm, be computationally automatable, and be applicable across the tree of life. We argue that Linnaean names, the most prevalent means of linking data to taxa, fail to meet these requirements due to fundamental theoretical and practical shortfalls. We argue that, for the purposes of data integration, we should instead use phylogenetic definitions transformed into formal logic expressions. We call such expressions phyloreferences, and argue that, unlike Linnaean names, they meet all requirements for effective data integration.
DOI
https://doi.org/10.32942/osf.io/57yjs
Subjects
Biodiversity, Bioinformatics, Computer Sciences, Databases and Information Systems, Ecology and Evolutionary Biology, Engineering, Life Sciences, Other Ecology and Evolutionary Biology, Physical Sciences and Mathematics, Software Engineering
Keywords
computational semantics, data integration, phylogenetic definitions, phylogenetic taxonomy, phyloreferences, taxon concepts, Tree of Life, tree thinking
Dates
Published: 2021-03-05 18:35
Last Updated: 2021-08-09 14:29