NCBITaxonomy.jl - rapid biological names finding and reconciliation

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Timothée Poisot, Rory Gibb, Sadie Jane Ryan, Colin J. Carlson

Abstract

NCBITaxonomy.jl is a package designed to facilitate the reconciliation and cleaning of taxonomic names, using a local copy of the NCBI taxonomic backbone (Federhen 2012, Schoch et al. 2020). The basic search functions are coupled with quality-of-life functions, including case-insensitive search and custom fuzzy string matching, to increase the amount of information that can be extracted automatically while still allowing efficient manual curation and inspection of results. NCBITaxonomy.jl works with version 1.6 of the Julia programming language (Bezanson et al. 2017) and relies on the Apache Arrow format to store a local copy of the NCBI raw taxonomy files. The design of NCBITaxonomy.jl was inspired by similar efforts, such as the R package taxadb (Norman et al. 2020), which provides an offline alternative to packages like taxize (Chamberlain and Szöcs 2013).

DOI

https://doi.org/10.32942/osf.io/uvbfj

Subjects

Biodiversity, Life Sciences

Keywords

ecoinformatics, NCBI, taxonomy

Dates

Published: 2021-10-21 12:33

License

CC-By Attribution-ShareAlike 4.0 International