Effect of sampling strategies on the response curves estimated by plant species distribution models

This is a Preprint and has not been peer reviewed. This is version 3 of this Preprint.



Manuele Bazzichetto, Jonathan Lenoir, Daniele Da Re, Enrico Tordoni, Duccio Rocchini, Marco Malavasi, Vojtech Barták, Marta Gaia Sperandii


Species distribution models (SDMs) rely on species presence/absence or abundance data and environmental variables to estimate species response curves. The quality and quantity (i.e., sample size) of the data describing a species’ distribution therefore determine the quality of the estimated species-environment relationship. However, SDMs are seldom fitted on high-quality data collected strictly for that purpose. Usually, they rely on a collection of opportunistic datasets gathered from previous projects or public repositories and originally collected with different objectives. Here, we aim to assess how the sampling strategy used to capture the geographic distribution of a species affects the accuracy and precision of its response curves along environmental gradients, as estimated by parametric SDMs. We simulated the occurrence of two virtual plant species across the Abruzzo region (Italy). We assumed that the two virtual plants were similarly affected by precipitation, but that one had a wider realised niche for temperature (i.e., higher thermal tolerance) and, as a result, a wider distribution extent. We then sampled occurrence data for the two species following five different sampling strategies: random, stratified, systematic, topographic, and uniform (the latter performed within the environmental space). In addition, we simulated a spatially biased sampling design by collecting presence/absence data close to roads. To account for sample size, we repeated our simulations along a gradient of increasing sampling effort, i.e., number of sampled locations. In total, we ran 500 replicates for each combination of sampling design and effort. For each replicate, we fitted SDMs using binomial generalised linear models, extracted the model coefficients for precipitation and temperature, and compared them with the true coefficients from the virtual species’ model.
We evaluated the quality of the estimated response curves by computing three measures: bias (accuracy), variance (precision), and mean squared error (accuracy and precision combined). Our results suggest that a proper estimate of the species response curve can be obtained when the choice of the sampling strategy is guided by the species’ ecology. In particular, species with wide tolerances to environmental drivers may be better modelled using data uniformly collected within the environmental space, while none of the tested sampling designs seemed to substantially outperform the others for modelling species with a narrow realised niche.
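
The evaluation loop described above (simulate a virtual species, sample it, fit a binomial generalised linear model, compare estimated and true coefficients) can be illustrated with a toy sketch. This is not the authors' code and makes several simplifying assumptions: a single standardised predictor instead of precipitation and temperature, hypothetical true coefficients (`TRUE_B0`, `TRUE_B1`), only two sampling designs, and a "random" design mimicked by drawing the predictor from a normal distribution so that common conditions are sampled most often. All function names are illustrative.

```python
import math
import random
from statistics import fmean

# Hypothetical "true" coefficients of a one-predictor virtual species.
TRUE_B0, TRUE_B1 = -0.5, 1.2


def inv_logit(eta):
    """Logistic function, with the linear predictor clamped to avoid overflow."""
    eta = max(-30.0, min(30.0, eta))
    return 1.0 / (1.0 + math.exp(-eta))


def fit_binomial_glm(xs, ys, n_iter=25, tol=1e-8):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson on the binomial likelihood."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = inv_logit(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # entries of the information matrix X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-12:            # singular information matrix: stop updating
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def estimate_quality(estimates, true_value):
    """Bias, variance and mean squared error of replicate coefficient estimates.

    The variance uses the 1/n form, so that mse == variance + bias ** 2.
    """
    mean_est = fmean(estimates)
    bias = mean_est - true_value
    variance = fmean((e - mean_est) ** 2 for e in estimates)
    mse = fmean((e - true_value) ** 2 for e in estimates)
    return {"bias": bias, "variance": variance, "mse": mse}


def run_design(design, n_sites=200, n_reps=100, seed=1):
    """Replicate sampling and fitting for one design; return slope quality."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_reps):
        if design == "uniform":    # even coverage of the environmental gradient
            xs = [rng.uniform(-3.0, 3.0) for _ in range(n_sites)]
        else:                      # "random": common conditions sampled most often
            xs = [rng.gauss(0.0, 1.0) for _ in range(n_sites)]
        ys = [1 if rng.random() < inv_logit(TRUE_B0 + TRUE_B1 * x) else 0
              for x in xs]
        slopes.append(fit_binomial_glm(xs, ys)[1])
    return estimate_quality(slopes, TRUE_B1)


if __name__ == "__main__":
    for design in ("random", "uniform"):
        print(design, run_design(design))
```

With the 1/n variance used here, the mean squared error equals the variance plus the squared bias, which is why it summarises accuracy and precision together. Varying `n_sites` corresponds to the gradient of sampling effort; the numbers this toy produces say nothing about the study's actual results.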




Subjects: Biodiversity, Ecology and Evolutionary Biology, Life Sciences, Other Life Sciences, Plant Sciences


Keywords: bias, ecological niche breadth, environmental space, mean squared error, sampling bias, simulation, virtual species


Published: 2022-08-27 20:52

Last Updated: 2022-08-29 06:47


CC-By Attribution-ShareAlike 4.0 International
