This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Abstract
Carbonic anhydrases (CAs) attract interest for their critical roles in various physiological processes and their potential application in CO2 sequestration to combat global warming. Although CAs are an important enzyme family, their classification and evolution remain elusive because of their high sequence diversity and long evolutionary history. In this paper, an in silico strategy, Motif-weighted Alignment for Structure-based Protein Classification (MASPC), was developed. It combines OmegaFold-predicted CA structures with a weighted structural motif alignment method, TM-weighted, to enable more precise and robust polymorphic analysis of large enzyme datasets. The MASPC strategy was first validated on 74 ground-truth CA structures extracted from the PDB, where it showed improved performance compared to sequence-based polymorphic analysis (ClustalO-RAxML). Subsequently, MASPC was applied to a representative database of 1603 CAs from 117 model organisms, with a focus on the α-, β-, and γ-CA classes, chosen to cover organisms from across the evolutionary history of life. The results indicated that α-, β-, and γ-CAs were well grouped into their own classes, with clearer clustering by the CA's source organism. The structural differences among the α-, β-, and γ-CAs revealed by MASPC support the current understanding that the CA classes are the result of convergent evolution. The sub-clusters within the α- and β-CAs are strongly associated with organisms according to their appearance in evolutionary history, demonstrating a close correlation between CA evolution and the evolution of life. Furthermore, MASPC was applied to identify 27 potential α-CAs in the NCBI database that have less than 40% sequence similarity to a template human carbonic anhydrase II (HCA-II) sequence, demonstrating possible applications in enzyme identification studies.
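The abstract does not spell out how TM-weighted scores an alignment, so the following is only a minimal sketch of the general idea of weighting a TM-score-style similarity by structural motif. The distance scale `d0` follows the standard TM-score definition; the per-residue `weights` argument, the normalization, and the function names are illustrative assumptions and are not taken from the MASPC code.

```python
def tm_d0(target_length):
    """Standard TM-score distance scale d0 (angstroms) for a target of given length."""
    # The formula is not meaningful for very short chains, so clamp at 0.5.
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def weighted_tm_score(distances, target_length, weights=None):
    """TM-score-like similarity from aligned residue-pair distances.

    distances: distance (angstroms) between each aligned residue pair after
        superposition.
    target_length: number of residues in the reference structure.
    weights: HYPOTHETICAL per-pair weights (e.g. larger for residues inside a
        conserved motif). With uniform weights this reduces to the usual
        TM-score sum for a fixed superposition.
    """
    if not distances:
        return 0.0
    if weights is None:
        weights = [1.0] * len(distances)
    if len(weights) != len(distances):
        raise ValueError("weights and distances must have the same length")

    d0 = tm_d0(target_length)
    # Per-pair similarity term used by the TM-score: 1 / (1 + (d / d0)^2).
    terms = [1.0 / (1.0 + (d / d0) ** 2) for d in distances]
    # Assumed normalization: weighted mean of the terms, scaled by coverage,
    # so the result stays in [0, 1] whatever the weight magnitudes are.
    weighted_mean = sum(w * t for w, t in zip(weights, terms)) / sum(weights)
    return (len(distances) / target_length) * weighted_mean
```

For example, `weighted_tm_score([0.0, 10.0], 100, weights=[3.0, 1.0])` scores higher than the unweighted call because the well-superposed pair counts for more. A pairwise matrix of such scores could then feed a clustering step, although how MASPC builds its classification from the scores is not described in the abstract.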
DOI
https://doi.org/10.32942/X25S7R
Subjects
Bioinformatics, Life Sciences
Keywords
Protein, alignment, evolution, Carbonic Anhydrase, carbon capture
Dates
Published: 2025-02-24 10:01
Last Updated: 2025-02-24 10:01
License
CC BY Attribution 4.0 International
Additional Metadata
Language:
English
Conflict of interest statement:
None
Data and Code Availability Statement:
Data and code are available at https://github.com/resplendentHSHI/TMweighted