Past and future uses of text mining in ecology & evolution

This is a Preprint and has not been peer reviewed. This is version 2 of this Preprint.


Maxwell Jenner Farrell, Liam Brierley, Anna Willoughby, Andrew Yates, Nicole Mideo


Ecology and evolutionary biology, like other scientific fields, are experiencing exponential
growth in the number of academic manuscripts. As domain knowledge accumulates, scientists will need
new computational approaches for identifying relevant literature to read and include in
formal literature reviews and meta-analyses. Importantly, these approaches can also
facilitate automated, large-scale data synthesis tasks and build structured databases from
the information in the texts of primary journal articles, books, grey literature, and
websites. The increasing availability of digital text, computational resources, and
machine-learning-based language models has led to a revolution in text analysis and
Natural Language Processing (NLP) in recent years. NLP has been widely adopted across
the biomedical sciences, but is rarely used in ecology and evolutionary biology. Applying
computational tools from text mining and NLP will increase the efficiency of data synthesis,
improve the reproducibility of literature reviews, formalize analyses of research biases and
knowledge gaps, and promote data-driven discovery of patterns across ecology and
evolutionary biology. Here we present recent use cases from ecology and evolution, and
discuss future applications, limitations, and ethical issues.



Subjects: Biodiversity, Bioinformatics, Ecology and Evolutionary Biology, Life Sciences


Keywords: biodiversity science, computational linguistics, database construction, document classification, Information Extraction, Information Retrieval, Named Entity Recognition, natural language processing, NLP, relation extraction, topic model


Published: 2022-02-17 05:03

Last Updated: 2022-04-05 21:59

CC-By Attribution-ShareAlike 4.0 International