An operational workflow for producing periodic estimates of species occupancy at large scales

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Rob James Boyd, Tom August, Robert Cooke, Mark Logie, Francesca Mancini, Gary Powney, David Roy, Katharine Turvey, Nick Isaac


Policy makers require high-level summaries of biodiversity change. However, deriving such summaries from raw biodiversity data is a complex process involving several intermediary stages. In this paper, we describe a workflow for generating annual estimates of species’ occupancy at national scales from raw species occurrence data, which can be used to construct a range of policy-relevant biodiversity indicators. We describe the workflow in detail: from data acquisition, data assessment and data manipulation, through modelling, model evaluation, application and dissemination. At each stage, we draw on our experience developing and applying the workflow for almost a decade to outline the challenges that analysts might face. These challenges span many areas of ecology, taxonomy, data science, computing and statistics. In our case, a key output of the workflow is annual estimates of occupancy, with measures of uncertainty, for over 5,000 species in each of several defined “regions” (e.g., countries or protected areas) of the United Kingdom from 1970 to 2019. This product corresponds closely to the notion of a species distribution “Essential Biodiversity Variable” (EBV). Throughout the paper, we note where the workflow can be adapted to other situations (e.g., geographic regions or data types). We also highlight areas where the workflow can be improved; in particular, we suggest incorporating methods to diagnose biases in the species occurrence data, to understand whether and to what extent these biases affect downstream products, and to mitigate them if needed. Finally, we compare the data products generated using our workflow to the first generation of species distribution EBVs and the “idealized” product as defined by others. Going forward, we hope that this paper can act as a template for research groups around the world seeking to develop similar data products.



Biodiversity, Bioinformatics, Ecology and Evolutionary Biology, Life Sciences


biodiversity, citizen science, Essential Biodiversity Variable, occupancy model, species distributions


Published: 2022-06-23 16:56


CC-By Attribution-ShareAlike 4.0 International