Large-bodied birds are over-represented in unstructured citizen science data

This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: https://doi.org/10.1038/s41598-021-98584-7. This is version 3 of this Preprint.

Authors

Corey Thomas Callaghan, Alistair G. B. Poore, Max Hofmann, Christopher Roberts, Henrique Pereira

Abstract

Citizen science platforms are quickly accumulating hundreds of millions of biodiversity observations around the world annually. Quantifying and correcting for the biases in citizen science datasets remains an important first step before these data are used to address ecological questions and monitor biodiversity. One source of potential bias among datasets is the difference between those citizen science programs that have unstructured protocols and those that have semi-structured or structured protocols for submitting observations. To quantify biases in an unstructured citizen science platform, we contrasted bird observations from the iNaturalist platform with those from a semi-structured citizen science platform, eBird, for the continental United States. We tested whether four traits of species (color, flock size, body size, and commonness) predicted whether a species was under- or over-represented in the unstructured dataset compared with the semi-structured dataset. We found strong evidence that large-bodied birds were over-represented in the unstructured citizen science dataset; moderate evidence that common species were over-represented in the unstructured dataset; moderate evidence that species in large groups were over-represented; and no evidence that colorful species were over-represented in unstructured citizen science data. Our results suggest that biases exist in unstructured citizen science data when compared with semi-structured data, likely as a result of the detectability of a species and the inherent recording process. Importantly, in programs like iNaturalist the detectability process is two-fold: first, an individual needs to be detected, and second, it needs to be photographed, which is likely easier for many large-bodied species.
Our results indicate that caution is warranted when using unstructured citizen science data in ecological modelling, and highlight body size as a fundamental trait that can be used as a covariate for modelling opportunistic species occurrence records, representing the detectability or identifiability in unstructured citizen science datasets. Future research in this space should continue to focus on quantifying and documenting biases in citizen science data, and should extend this work by including structured citizen science data to understand how biases differ among unstructured, semi-structured, and structured citizen science platforms.
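
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the abstract does not state which index or statistical model was used, so the log-ratio index, the simple least-squares slope, the function names, and the toy counts below are all assumptions made purely for illustration.

```python
import math

# Hypothetical toy records (NOT data from the study): observation counts per
# species on an unstructured and a semi-structured platform, plus body mass.
toy_records = {
    "species_a": {"unstructured": 400, "semi_structured": 1000, "mass_g": 4000.0},
    "species_b": {"unstructured": 300, "semi_structured": 1500, "mass_g": 1000.0},
    "species_c": {"unstructured": 200, "semi_structured": 3000, "mass_g": 100.0},
    "species_d": {"unstructured": 100, "semi_structured": 4500, "mass_g": 10.0},
}


def representation_index(records):
    """Assumed index: log10 of a species' share of unstructured observations
    divided by its share of semi-structured observations.
    Positive values mean over-representation in the unstructured data."""
    total_u = sum(r["unstructured"] for r in records.values())
    total_s = sum(r["semi_structured"] for r in records.values())
    return {
        name: math.log10(
            (r["unstructured"] / total_u) / (r["semi_structured"] / total_s)
        )
        for name, r in records.items()
    }


def ols_slope(xs, ys):
    """Slope of an ordinary least-squares line fitted to (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


index = representation_index(toy_records)
names = sorted(toy_records)
log_mass = [math.log10(toy_records[n]["mass_g"]) for n in names]
slope = ols_slope(log_mass, [index[n] for n in names])

# A positive slope means the index rises with body size in this toy example.
print(f"slope of index on log10(body mass): {slope:.3f}")
```

In a real analysis the index would be regressed on all four traits together (color, flock size, body size, and commonness), and uncertainty in the estimates would need to be quantified; the single-trait slope here only shows the direction of a relationship in made-up numbers.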

DOI

https://doi.org/10.32942/osf.io/vnspb

Subjects

Biodiversity, Ecology and Evolutionary Biology, Life Sciences

Keywords

biases, Birds, citizen science, Community science, detectability, eBird, GBIF, iNaturalist, presence-only data, species occurrence data

Dates

Published: 2021-03-05 20:15

Last Updated: 2021-09-28 14:23

License

CC-By Attribution-NonCommercial-NoDerivatives 4.0 International