Automated invertebrate classification using computer vision has shown significant potential to improve specimen processing efficiency. However, the diversity of invertebrates and the morphological similarity among taxa can make fine-scale taxonomic classification from images difficult. As a result, many invertebrate computer vision models are forced to classify at coarser levels, such as family or order.
Here we propose a novel framework that combines computer vision and bulk DNA metabarcoding specimen processing pipelines to improve the accuracy and taxonomic granularity of individual specimen classifications. To improve specimen classification accuracy, our framework uses multimodal fusion models that combine image data with DNA-based assemblage data. To refine the taxonomic granularity of the model’s classifications, our framework cross-references the classifications with DNA metabarcoding detections from bulk samples. We demonstrated this framework using a continental-scale invertebrate bycatch dataset collected by the National Ecological Observatory Network. The dataset included 17 taxa spanning three phyla (Annelida, Arthropoda, and Mollusca); the finest starting taxonomic granularity among these taxa was order level.
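The cross-referencing step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rank list, the `refine` function, the lineage dictionaries, and the refinement rule (keep the deepest lineage shared by every DNA detection that falls under the predicted taxon) are all assumptions made for illustration, and the example taxa are not drawn from the dataset.

```python
from typing import Dict, List, Optional

# Ranks from coarse to fine (a simplified hierarchy, assumed for this sketch).
RANKS = ["phylum", "class", "order", "family", "genus", "species"]

# A lineage maps rank -> taxon name; finer ranks may be missing.
Lineage = Dict[str, str]


def refine(predicted_rank: str, predicted_taxon: str,
           detections: List[Lineage]) -> Optional[Lineage]:
    """Refine a coarse image classification using bulk-sample DNA detections.

    Hypothetical rule: collect every detection whose lineage contains the
    predicted taxon, then keep the ranks on which all of them agree.
    Returns None when no detection supports the prediction.
    """
    matches = [d for d in detections if d.get(predicted_rank) == predicted_taxon]
    if not matches:
        return None  # no DNA support; the original classification stands

    refined: Lineage = {}
    for rank in RANKS:
        names = {d.get(rank) for d in matches}
        # Stop at the first rank that is ambiguous or unresolved.
        if len(names) != 1 or None in names:
            break
        refined[rank] = names.pop()
    return refined


# Illustrative detections for one bulk sample (names are examples only).
sample_detections = [
    {"phylum": "Arthropoda", "class": "Arachnida", "order": "Araneae",
     "family": "Lycosidae", "genus": "Pardosa", "species": "Pardosa milvina"},
    {"phylum": "Arthropoda", "class": "Insecta", "order": "Diptera",
     "family": "Muscidae"},
]

spider = refine("order", "Araneae", sample_detections)
fly = refine("order", "Diptera", sample_detections)
```

In this sketch, the order-level prediction "Araneae" is refined to species because only one matching lineage was detected, while "Diptera" is refined only to family because the detection itself stops there. When several detected lineages share the predicted taxon, the rule falls back to the finest rank they have in common, which is one plausible reading of why only some classifications reach species level.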
Using this framework, we reached a classification accuracy of 79.6% across the 17 taxa using real DNA assemblage data, and 83.6% when the assemblage data were “error-free”. These correspond to increases of 2.2 and 6.2 percentage points, respectively, over a model trained using only images. After cross-referencing with the DNA metabarcoding detections, we improved taxonomic granularity in up to 72.2% of classifications, with up to 5.7% reaching species level.
By providing computer vision models with coincident DNA assemblage data, and refining individual classifications using DNA metabarcoding detections, our framework has the potential to greatly expand the capabilities of biological computer vision classifiers. It allows computer vision classifiers to infer taxonomically fine-grained classifications when doing so would otherwise be difficult or impossible because of morphological similarity or data scarcity. The framework is not limited to terrestrial invertebrates and could be applied wherever image and DNA metabarcoding data are collected concurrently.
This dataset includes 0 records.
NEON Biorepository Data Portal. Blair et al. 2024. A hybrid approach to biomonitoring (ID: 189). https://biorepo.neonscience.org/portal/collections/list.php?datasetid=189 (accessed 2024-12-08).