This is a Preprint and has not been peer reviewed. This is version 2 of this Preprint.
Addressing Unobserved Covariates in Species Distribution Models: Impacts on Inferential Quality and Mitigation via Joint Species Distribution Models
Abstract
Species distribution models (SDMs) are widely used in ecology to assess the distribution of species populations across space and time. Correlative SDMs, in particular, are used to infer relationships between species records and environmental variables. A classical way to implement this type of SDM is to use generalized linear mixed models (GLMMs) as a parametric regression method. However, because species-environment relationships are complex, species distributions may depend on unobserved or unmeasurable covariates. In this article, we first recall mathematical results showing that such “omitted covariates” typically introduce statistical issues that can bias the inferred effects of observed covariates or yield improper confidence intervals. So far, these results have received little attention in ecology. We then present a comprehensive simulation-based investigation of the statistical impact of unobserved covariates on the inference performance of GL(M)Ms for continuous, count, and binary data. We assess various regression methods, including both frequentist and Bayesian SDMs, as well as joint species distribution models (JSDMs), which are used to account for interspecific covariation in presence–absence data. Our work demonstrates that JSDMs provide a robust statistical approach that mitigates the inferential issues arising in SDMs from missing covariates and enables reliable estimates of environmental effects. We further complement these simulation results by applying JSDMs and SDMs to several ecological datasets, which reveals discrepancies between SDM and JSDM estimates of environmental effects and a better predictive capacity for JSDMs than for SDMs. As a general recommendation, we encourage ecologists and practitioners to consider fitting JSDMs when dealing with community data, so that they can evaluate whether any information can be extracted from between-species residuals.
More broadly, our results apply to any GL(M)M in which important variables are suspected of being omitted. In such cases, generalized linear latent variable models (GLLVMs) could properly correct inference when different entities share the same important omitted covariate.
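The omitted-covariate bias recalled in the abstract can be illustrated with a minimal sketch in the simplest setting, a Gaussian linear model with one observed covariate `x` and one unobserved covariate `z`. This is not the authors' simulation design: the sample size, the coefficients `beta_x` and `beta_z`, the correlation `rho`, and all function names are assumptions chosen for illustration. When `z` is correlated with `x` and left out of the model, the estimated slope on `x` converges to roughly `beta_x + beta_z * rho` rather than `beta_x`.

```python
import random


def ols_slope(x, y):
    """Slope of a simple linear regression of y on x (intercept included)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def ols_two_covariates(x, z, y):
    """Slopes on x and z of a regression of y on both (intercept included)."""
    n = len(x)
    mx, mz, my = sum(x) / n, sum(z) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((c - mz) ** 2 for c in z)
    sxz = sum((a - mx) * (c - mz) for a, c in zip(x, z))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    szy = sum((c - mz) * (b - my) for c, b in zip(z, y))
    # Solve the 2x2 centered normal equations explicitly.
    det = sxx * szz - sxz ** 2
    slope_x = (szz * sxy - sxz * szy) / det
    slope_z = (sxx * szy - sxz * sxy) / det
    return slope_x, slope_z


def simulate(n=20000, beta_x=1.0, beta_z=2.0, rho=0.5, seed=1):
    """Draw (x, z, y) with corr(x, z) = rho; all values are illustrative."""
    rng = random.Random(seed)
    x, z, y = [], [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        # z has unit variance and correlation rho with x.
        zi = rho * xi + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        yi = 0.5 + beta_x * xi + beta_z * zi + rng.gauss(0.0, 1.0)
        x.append(xi)
        z.append(zi)
        y.append(yi)
    return x, z, y


x, z, y = simulate()
# Model that omits z: the slope absorbs part of the effect of z.
slope_omitted = ols_slope(x, y)
# Model that includes z: the slope on x is recovered.
slope_full, slope_z = ols_two_covariates(x, z, y)
print(f"slope on x, z omitted:  {slope_omitted:.3f}")
print(f"slope on x, z included: {slope_full:.3f}")
```

With these assumed values, the slope from the model that omits `z` should be close to 2, while the model that includes `z` should recover a slope close to the true value of 1. The sketch covers only the linear Gaussian case with a correlated omitted covariate; it does not reproduce the count or binary settings, nor the JSDM and GLLVM fits discussed in the abstract.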
DOI
https://doi.org/10.32942/X2RS9S
Subjects
Life Sciences, Physical Sciences and Mathematics
Keywords
Unobserved covariate, Missing covariate, Omitted variable bias (OVB), Species distribution model (SDM), Joint species distribution model (JSDM), Model misspecification, Bias mitigation, Generalized linear latent variable model (GLLVM)
Dates
Published: 2026-03-04 12:30
License
CC BY Attribution 4.0 International
Additional Metadata
Conflict of interest statement:
None declared.
Data and Code Availability Statement:
Code is available in the first author's GitHub repository. The datasets are accessible through the corresponding references.
Language:
English