Shared acoustic manifolds for exploratory comparison of passerine vocalizations

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Lucio Arese 

Abstract

This study presents a fixed-parameter pipeline for reproducibly embedding frame-level representations of multiple passerine vocalizations in shared low-dimensional spaces. Three passerine species are considered: Eurasian Wren, Tree Pipit and Common Chaffinch, with four individuals selected for each species group. Vocalization frames from each species group are mapped into a single three-dimensional coordinate system to allow comparison between individuals while preserving temporal continuity. The pipeline operates under a controlled protocol with an unsupervised, geometry-first exploratory approach. Two feature representations are used: MFCC (40 coefficients with delta and delta-delta) and 80-bin chroma vectors. The two feature sets provide complementary analytical lenses on the signal, ranging from spectral-envelope dynamics to relative frequency organization, without imposing discrete musical categories. Dimensionality reduction consists of a PCA-20 preconditioning step followed by a UMAP embedding, yielding six manifolds in total (two feature spaces × three species). The resulting embeddings are visualized as continuous trajectories in two separate layouts: one in which individuals are distinguished by solid colors, and an augmented one in which descriptor values are overlaid as color coding after embedding. The descriptors include spectral centroid and a chroma-derived concentration measure (Chroma Energy Concentration or CEC, introduced in this work), visualized as scalar fields on the manifold geometry. A supplementary case study demonstrates event-level backtracking from localized manifold regions to the underlying audio, enabling identification of recurring vocal events concentrated in specific embedding regions.
The framework operates independently of labeling or categorization: it provides a descriptive interface intended to complement spectrogram-based analysis, supporting qualitative comparison and hypothesis generation.
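
The event-level backtracking step mentioned in the abstract can be illustrated with a minimal sketch. The author's code is not publicly released, so everything below is an assumption made for illustration: the `Frame` record, the spherical selection region, the helper names `frames_in_region` and `group_events`, and the `hop_s` and `frame_s` timing parameters are all hypothetical. The sketch only shows the general idea of selecting embedded frames in a localized region and converting runs of consecutive frame indices back into time intervals in each individual's recording.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Frame:
    """One embedded analysis frame (hypothetical record layout)."""
    individual: str                    # which bird the frame came from
    index: int                         # frame index within that recording
    xyz: Tuple[float, float, float]    # 3-D embedding coordinates


def frames_in_region(frames: List[Frame],
                     center: Tuple[float, float, float],
                     radius: float) -> List[Frame]:
    """Select frames whose embedding lies inside a sphere (assumed region shape)."""
    r2 = radius ** 2
    return [f for f in frames
            if sum((a - b) ** 2 for a, b in zip(f.xyz, center)) <= r2]


def group_events(selected: List[Frame], hop_s: float, frame_s: float,
                 max_gap: int = 1) -> List[Tuple[str, float, float]]:
    """Merge consecutive frames per individual into (individual, start_s, end_s).

    hop_s is the assumed time step between frames and frame_s the assumed
    frame duration, both in seconds. Frames whose indices differ by more
    than max_gap start a new event.
    """
    by_individual: Dict[str, List[int]] = {}
    for f in selected:
        by_individual.setdefault(f.individual, []).append(f.index)

    events: List[Tuple[str, float, float]] = []
    for individual, indices in sorted(by_individual.items()):
        indices.sort()
        start = prev = indices[0]
        for i in indices[1:]:
            if i - prev > max_gap:
                # Close the current run and start a new one.
                events.append((individual, start * hop_s, prev * hop_s + frame_s))
                start = i
            prev = i
        events.append((individual, start * hop_s, prev * hop_s + frame_s))
    return events
```

Each returned interval can then be used to cut the corresponding audio segment for listening or spectrogram inspection, which is one plausible way to check whether a dense embedding region corresponds to a recurring vocal event.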

DOI

https://doi.org/10.32942/X2W65N

Subjects

Behavior and Ethology, Ecology and Evolutionary Biology, Research Methods in Life Sciences

Keywords

bioacoustics, acoustic features, MFCC, chroma, UMAP, dimensionality reduction, exploratory, manifold learning, unsupervised learning, passerine vocalizations, birdsong, backtracking

Dates

Published: 2026-01-23 19:38

Last Updated: 2026-01-23 19:38

License

CC BY Attribution 4.0 International

Additional Metadata

Conflict of interest statement:
None

Data and Code Availability Statement:
Associated CSV outputs and supplementary videos are available via Zenodo (DOI: https://doi.org/10.5281/zenodo.18332166). Analytical code is not publicly released, but is available from the author upon reasonable request.

Language:
English