This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.
Authors
Abstract
Automated species detection in camera trap images with deep learning techniques has become common in ecological monitoring. Camera trap image data sets are challenging to work with because of modest data set size, high class imbalance owing to low prevalence of the species of interest, and image backgrounds that vary within and between cameras. Strategies to tackle these difficulties can be adopted at the data handling and pre-processing stage, in the choice of model architecture, and during model training. Here we report insights regarding these strategies from a case study that aimed to detect a large wading bird (grey heron, Ardea cinerea) in images from different camera traps. Model performance improved with data splitting according to a non-random strategy, higher resolution images, and standard minority oversampling with data augmentation in color space. An object detection architecture (YOLOv5x6) performed better than an image classification architecture (MobileNetV2), while using fewer computing resources. Transfer learning through initial weights derived from models pre-trained on similar data was beneficial, but fine-tuning models on the data set at hand remained important. Finally, we highlight the dependence of predictive performance on class imbalance, and the underlying assumption that the prevalence in the test set is representative of the data sets to which the model will be applied. We discuss different performance metrics, emphasizing the importance of reporting the complete set of basic metrics along with the test set prevalence, and illustrate the use of metrics in downstream ecological analyses.
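The dependence of predictive performance on prevalence can be illustrated with a minimal sketch. It applies the standard relation between precision, sensitivity, specificity, and prevalence; the function name `precision_at_prevalence` and all numeric values are illustrative assumptions and are not results from the study.

```python
def precision_at_prevalence(sensitivity, specificity, prevalence):
    """Expected precision (positive predictive value) of a binary detector.

    Standard relation:
        precision = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    All inputs are proportions in [0, 1].
    """
    true_pos = sensitivity * prevalence                    # expected share of true positives
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # expected share of false positives
    if true_pos + false_pos == 0.0:
        return float("nan")  # no positive predictions at all
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical detector quality (assumed values, not taken from the preprint).
    sens, spec = 0.90, 0.90
    for prev in (0.50, 0.10, 0.01):
        p = precision_at_prevalence(sens, spec, prev)
        print(f"prevalence={prev:.2f}  precision={p:.3f}")
```

With sensitivity and specificity held fixed, precision falls as the target species becomes rarer, which is why a precision value is hard to interpret unless the test set prevalence is reported alongside it.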
DOI
https://doi.org/10.32942/X2VW4V
Subjects
Life Sciences
Keywords
Dates
Published: 2024-12-27 11:17
License
Additional Metadata
Language:
English