This is a Preprint and has not been peer reviewed. This is version 2 of this Preprint.
prepR4pcm: An R Package for Preparing Data and Trees for Phylogenetic Comparative Methods
Authors
Abstract
1. Phylogenetic comparative methods require species names in a trait dataset to match tip labels in a
phylogenetic tree. Yet this apparently simple prerequisite is often one of the most fragile steps in a
comparative workflow. Names may differ for many reasons, including formatting, taxonomic revisions,
synonyms, and spelling errors. If these differences are resolved informally, species can be lost from
analyses, and the reasons for their loss can be difficult to reconstruct.
2. Here, we present prepR4pcm, an R package for preparing data and trees for phylogenetic comparative
methods. The package reconciles species names through a staged procedure: exact matching,
normalised matching, synonym lookup with local taxonomic databases, and optional fuzzy matching
for likely spelling errors. Each decision is stored in a reconciliation object with the original name,
matched name, match type, confidence score, and a short explanation. This object turns name
matching from a hidden preprocessing step into an auditable part of the analysis.
3. prepR4pcm also supports the points where comparative workflows need human judgement. Users
can inspect unresolved names, accept or reject suggested matches, add manual corrections, apply
taxonomy crosswalks (which link names across taxonomic systems), compare reconciliation runs,
and generate reports. The package then returns a matched data frame and pruned tree with the
same species set, ready for phylogenetic generalised least squares, phylogenetic mixed models,
phylogenetic meta-analysis, and related workflows. If users do not yet have a tree, prepR4pcm can
retrieve trees from several sources, date trees when suitable information is available, and format
tree-source citations.
4. We illustrate the workflow using bundled datasets with realistic name mismatches. prepR4pcm is
available at https://github.com/itchyshin/prepR4pcm with documentation and vignettes covering
data and tree reconciliation, tree retrieval, multi-tree workflows, and phylogenetic meta-analysis.
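The staged procedure described in point 2 can be illustrated with a minimal sketch. It is written in Python for illustration only and is not prepR4pcm's R interface. The names `MatchRecord`, `normalise`, and `reconcile`, the synonym dictionary, the confidence values, and the 0.9 fuzzy cutoff are all assumptions made for this example. Each name passes through exact matching, normalised matching, synonym lookup, and fuzzy matching in turn, and every decision is kept as a record with the original name, matched name, match type, confidence score, and a short explanation.

```python
import difflib
from dataclasses import dataclass
from typing import Optional


@dataclass
class MatchRecord:
    """One auditable decision (hypothetical structure for illustration)."""
    original: str
    matched: Optional[str]
    match_type: str
    confidence: float
    note: str


def normalise(name: str) -> str:
    """Lower-case, treat underscores as spaces, and collapse whitespace."""
    return " ".join(name.replace("_", " ").lower().split())


def reconcile(data_names, tip_labels, synonyms=None, fuzzy_cutoff=0.9):
    """Match each data name to a tip label in stages, recording every decision.

    `synonyms` maps a normalised synonym to an accepted name; it stands in
    for a local taxonomic database. Confidence values are illustrative.
    """
    synonyms = synonyms or {}
    tips = set(tip_labels)
    norm_to_tip = {normalise(t): t for t in tip_labels}
    records = []
    for name in data_names:
        norm = normalise(name)
        # Stage 1: exact match against the tip labels.
        if name in tips:
            records.append(MatchRecord(name, name, "exact", 1.0,
                                       "identical to a tip label"))
            continue
        # Stage 2: match after normalising case and separators.
        if norm in norm_to_tip:
            records.append(MatchRecord(name, norm_to_tip[norm], "normalised", 1.0,
                                       "matched after normalising case and separators"))
            continue
        # Stage 3: synonym lookup, then match the accepted name to a tip.
        accepted = synonyms.get(norm)
        if accepted is not None and normalise(accepted) in norm_to_tip:
            records.append(MatchRecord(name, norm_to_tip[normalise(accepted)],
                                       "synonym", 0.95,
                                       f"synonym of {accepted}"))
            continue
        # Stage 4 (optional): fuzzy match for likely spelling errors.
        close = difflib.get_close_matches(norm, list(norm_to_tip),
                                          n=1, cutoff=fuzzy_cutoff)
        if close:
            score = difflib.SequenceMatcher(None, norm, close[0]).ratio()
            records.append(MatchRecord(name, norm_to_tip[close[0]], "fuzzy",
                                       round(score, 2),
                                       "possible spelling error; review before accepting"))
            continue
        records.append(MatchRecord(name, None, "unresolved", 0.0,
                                   "no candidate found"))
    return records


if __name__ == "__main__":
    tips = ["Homo_sapiens", "Pan_troglodytes", "Puma_concolor", "Gorilla_gorilla"]
    data = ["Homo_sapiens", "pan troglodytes", "Felis concolor",
            "Gorila gorilla", "Mus musculus"]
    for rec in reconcile(data, tips, synonyms={"felis concolor": "Puma concolor"}):
        print(rec)
```

Because unresolved and fuzzy-matched names remain in the record list instead of being silently dropped, a user can review them before the data frame and tree are reduced to a shared species set.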
DOI
https://doi.org/10.32942/X2468Z
Subjects
Ecology and Evolutionary Biology, Life Sciences
Keywords
phylogenetic comparative analysis, species names, taxonomic harmonisation, data–tree reconciliation, provenance
Dates
Published: 2026-06-17 00:08
Last Updated: 2026-06-17 00:08
License
CC BY Attribution 4.0 International
Additional Metadata
Data and Code Availability Statement:
https://github.com/itchyshin/prepR4pcm/
Language:
English