This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.
Genome-wide strengthening of evolutionary constraint across the volvocine multicellularity gradient
Abstract
The evolutionary cost of maintaining somatic cell populations has been hypothesised to drive proteome-wide strengthening of constraint on coding sequence at the origin of multicellularity. Prior work on the volvocine clade has reported lower d_N/d_S in colonial than in unicellular relatives for 55 chloroplast genes (Hu et al. 2019) and for 105 nuclear single-copy orthogroups across colonial species (Hu et al. 2020). What has been lacking is a nuclear genome-wide branch-model analysis combined with per-species phylogenetic regression, an orthogonal amino-acid branch-length analysis with non-independence-corrected significance, a functional-category breakdown, and within-clade replication. Here, ten volvocine green algae spanning the unicellular Chlamydomonas reinhardtii, the colonial intermediates Tetrabaena, Gonium, the independently evolved partial-soma lineage Astrephomene gubernaculifera (Goniaceae), Yamagishiella, Eudorina, Pleodorina, and three species of Volvox (V. carteri, V. africanus, V. reticuliferus) were analysed by branch-model d_N/d_S on 1,755 single-copy nuclear orthogroups (845 retained after quality control), an order of magnitude more than in the closest prior nuclear analysis (Hu et al. 2020). Foreground (multicellular) omega was lower than background (Chlamydomonas) omega in 88.6 % of orthogroups (median omega_fg = 0.046, median omega_bg = 0.117); the pattern occurred across all 20 COG functional categories analysed (range 78-100 % per category).
Independent IQ-TREE amino-acid branch-length estimation under LG+Gamma on 1,456 orthogroups recovered the same direction of pattern under an entirely orthogonal substitution model: 73.1 % of orthogroups had shorter aggregated multicellular terminal branches than the Chlamydomonas branch, and 96.1 % of orthogroups (1,395/1,451 with complete data across all ten species) exhibited a negative per-gene Spearman correlation between log2(cell number) and amino-acid branch length (species-label permutation test, 10,000 permutations: empirical one-sided P = 0.0107); we note that amino-acid branch length conflates substitution rate, generation time, and selective constraint, and therefore provides a direction-of-pattern validation rather than an independent estimate of selection intensity. A per-species phylogenetic generalised least squares (PGLS) regression of log(median amino-acid branch length) on log2(cell number) was negative (slope = -0.123, P = 0.002, R-squared = 0.72, Pagel's lambda_ML approximately 0; n = 10). Sensitivity analyses confirmed robustness to orthogroup filtering (strict-quality P = 0.0004), to complete exclusion of V. reticuliferus (which exhibited assembly-related orthogroup-level saturation; P = 0.006), and to additional exclusion of Tetrabaena (a known rate-elevated outlier; P = 0.015). Cautionary reanalyses of choanoflagellate-to-sponge and brown-algal transitions illustrate how codon-model saturation and asymmetric quality filtering can substantially distort apparent branch-model signals at deeper transitions. The ten-species result extends the previously reported chloroplast and limited-nuclear signal (Hu et al. 2019, 2020) to the nuclear proteome scale with per-species phylogenetic correction, per-gene gradient analysis, functional-category breakdown, and within-clade replication, and is consistent with (though not by itself a direct test of) the disposable-soma prediction that somatic-maintenance investment strengthens with multicellular complexity.
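For readers unfamiliar with the species-label permutation test mentioned in the abstract, the following is a minimal sketch of one way such a test could be set up; it is not the deposited analysis code. It assumes the test statistic is the fraction of orthogroups with a negative per-gene Spearman correlation, which the abstract does not state explicitly. All function names and the toy values (five hypothetical species, three hypothetical genes) are illustrative only and are not data from the preprint.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def fraction_negative(log2_cells, branch_lengths):
    """Fraction of genes whose branch length falls as cell number rises."""
    negative = sum(1 for gene in branch_lengths if spearman(log2_cells, gene) < 0)
    return negative / len(branch_lengths)


def permutation_p(log2_cells, branch_lengths, n_perm=1000, seed=0):
    """One-sided empirical P from shuffling species labels.

    The same shuffled labels are applied to every gene in a permutation,
    so correlation between genes (non-independence) is preserved.
    """
    rng = random.Random(seed)
    observed = fraction_negative(log2_cells, branch_lengths)
    shuffled = list(log2_cells)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if fraction_negative(shuffled, branch_lengths) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy input (hypothetical): log2(cell number) per species, and one row of
# per-species amino-acid branch lengths for each gene.
log2_cells = [0.0, 2.0, 4.0, 5.0, 11.0]
branch_lengths = [
    [0.30, 0.25, 0.20, 0.18, 0.10],
    [0.50, 0.41, 0.44, 0.30, 0.22],
    [0.12, 0.15, 0.09, 0.08, 0.05],
]
observed, p_value = permutation_p(log2_cells, branch_lengths)
print(f"fraction negative = {observed:.3f}, empirical P = {p_value:.4f}")
```

Shuffling the species labels once per permutation and reusing them for every gene is what makes the resulting P value account for shared phylogenetic signal across orthogroups; treating each gene as an independent trial would overstate significance.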
DOI
https://doi.org/10.32942/X2DD40
Subjects
Computational Biology, Ecology and Evolutionary Biology, Evolution, Genetics and Genomics, Genomics
Keywords
Volvocine algae, Multicellularity evolution, Coding-sequence constraint, dN/dS, PAML branch model, IQ-TREE, Phylogenetic regression, Somatic maintenance, Comparative genomics, Disposable soma theory
Dates
Published: 2026-07-03 00:59
Last Updated: 2026-07-03 00:59
License
CC-By Attribution-NonCommercial-NoDerivatives 4.0 International
Additional Metadata
Conflict of interest statement:
None
Data and Code Availability Statement:
All analysis scripts, intermediate data files, PAML output, IQ-TREE output, eggNOG annotations, per-orthogroup outputs, summary tables, figures, and supplementary tables are deposited at Zenodo (DOI: 10.5281/zenodo.21094823; https://doi.org/10.5281/zenodo.21094823) under CC BY 4.0 (data/figures) and MIT (scripts) licenses. Genome accessions for the ten volvocine species used in the analysis are listed in Supplementary Table S1.
Language:
English