A standard protocol for harvesting biodiversity data from Facebook

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.



Shawan Chowdhury, Sultan Ahmed, Shofiul Alam, Corey T Callaghan, Priyanka Das, Moreno Di Marco, Enrico Di Minin, Ivan Jarić, Mahzabin Muzahid Labi, Md Rokonuzzaman, Uri Roll, Valerio Sbragaglia, Asma Siddika, Aletta Bonn


1. The expanding use of citizen science platforms has led to an exponential increase in biodiversity data in global repositories. Yet, our understanding of species distributions remains patchy for most of the world. Social media data have the potential to reduce the global biodiversity knowledge gap. However, practical guidelines and standardised pipelines for harvesting data from such sources are still missing.
2. Here, we provide a standardised framework to extract species distribution records from Facebook groups that allow access to their data, in keeping with data privacy and protection safeguards. In some countries, Facebook groups are actively used and moderated to share species records. We show how to structure keywords, search for species photographs, and georeference the localities of such records. We further highlight some challenges that users might face when extracting species distribution data from Facebook and suggest potential solutions.
3. Following our proposed framework, we present a case study on the biodiversity of Bangladesh, a tropical, megadiverse South Asian country. We scraped nearly 45,000 unique locality records for 967 species, with a median of 27 records per species. About 12% of the distribution data were for threatened species, which represent 27% of all species. We also obtained data for 56 Data Deficient species.
4. If carefully harvested, social media data can significantly reduce global biodiversity knowledge gaps. Consequently, developing an automated tool to extract and interpret social media biodiversity data is an essential research priority.
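
The keyword-structuring and record-summary steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout, the example species and coordinates, and the helper names (build_keywords, unique_records, records_per_species) are all hypothetical, and it assumes the records have already been collected and georeferenced with permission from the groups concerned.

```python
from collections import Counter
from statistics import median

# Hypothetical record layout: (species, locality, latitude, longitude).
# The values below are illustrative only, not data from the study.
records = [
    ("Papilio polytes", "Dhaka", 23.81, 90.41),
    ("Papilio polytes", "Dhaka", 23.81, 90.41),  # same record posted twice
    ("Papilio polytes", "Sylhet", 24.89, 91.87),
    ("Danaus chrysippus", "Khulna", 22.85, 89.54),
]


def build_keywords(scientific_name, common_names=()):
    """Return normalised search keywords: scientific name first, then common names."""
    seen, keywords = set(), []
    for name in (scientific_name, *common_names):
        key = " ".join(name.split()).lower()  # collapse whitespace, lowercase
        if key and key not in seen:
            seen.add(key)
            keywords.append(key)
    return keywords


def unique_records(rows):
    """Drop exact duplicate records while preserving their original order."""
    return list(dict.fromkeys(rows))


def records_per_species(rows):
    """Count how many records each species has."""
    return Counter(species for species, *_ in rows)


unique = unique_records(records)
counts = records_per_species(unique)
median_count = median(counts.values())
```

Exact-match deduplication is the simplest possible choice here; a real workflow would probably also need to merge near-duplicates, such as the same sighting shared in several groups with slightly different locality names.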




Subjects:
Ecology and Evolutionary Biology, Life Sciences


Keywords:
Wallacean shortfall, megadiverse countries, tropics, social media, iEcology, Facebook, crowdsourcing, citizen science, Bangladesh


Published: 2023-08-27 10:11

Last Updated: 2023-08-27 14:11


CC BY Attribution 4.0 International

Additional Metadata


Data and Code Availability Statement:
Our extracted Facebook data are publicly available (Chowdhury et al., 2022b).