This is a limited proof of concept to search for research data, not a production system.

Title: Rapid Creation of a Data Product for the World's Specimens of Horseshoe Bats and Relatives, a Known Reservoir for Coronaviruses

Type: Dataset

Citation: Mast, Austin R., Paul, Deborah L., Rios, Nelson, Bruhn, Robert, Dalton, Trevor, Krimmel, Erica R., Pearson, Katelin D., Sherman, Aja, Shorthouse, David P., Simmons, Nancy B., Soltis, Pam, Upham, Nathan (2021): Rapid Creation of a Data Product for the World's Specimens of Horseshoe Bats and Relatives, a Known Reservoir for Coronaviruses. Zenodo. Dataset. https://zenodo.org/record/4743102

Authors: Mast, Austin R. (Florida State University); Paul, Deborah L. (Florida State University); Rios, Nelson (Yale University Peabody Museum of Natural History); Bruhn, Robert (Florida State University); Dalton, Trevor (Florida State University); Krimmel, Erica R. (Florida State University); Pearson, Katelin D. (Florida State University); Sherman, Aja (Florida State University); Shorthouse, David P. (Agriculture and Agri-Food Canada); Simmons, Nancy B. (American Museum of Natural History); Soltis, Pam (University of Florida); Upham, Nathan (Arizona State University)

Summary

This repository is associated with NSF DBI 2033973, RAPID Grant: Rapid Creation of a Data Product for the World's Specimens of Horseshoe Bats and Relatives, a Known Reservoir for Coronaviruses (https://www.nsf.gov/awardsearch/showAward?AWD_ID=2033973). Specifically, this repository contains (1) raw data from iDigBio (http://portal.idigbio.org) and GBIF (https://www.gbif.org), (2) R code for reproducible data wrangling and improvement, (3) protocols associated with data enhancements, and (4) enhanced versions of the dataset published at various project milestones. Additional code associated with this grant can be found in the BIOSPEX repository (https://github.com/iDigBio/Biospex). Long-term data management of the enhanced specimen data created by this project is expected to be accomplished by the natural history collections curating the physical specimens, a list of which can be found in this Zenodo resource.

Grant abstract: "The award to Florida State University will support research contributing to the development of georeferenced, vetted, and versioned data products of the world's specimens of horseshoe bats and their relatives for use by researchers studying the origins and spread of SARS-like coronaviruses, including the causative agent of COVID-19. Horseshoe bats and other closely related species are reported to be reservoirs of several SARS-like coronaviruses. Species of these bats are primarily distributed in regions where these viruses have been introduced to populations of humans. Currently, data associated with specimens of these bats are housed in natural history collections that are widely distributed both nationally and globally. Additionally, information tying these specimens to localities are mostly vague, or in many instances missing. This decreases the utility of the specimens for understanding the source, emergence, and distribution of SARS-COV-2 and similar viruses. This project will provide quality georeferenced data products through the consolidation of ancillary information linked to each bat specimen, using the extended specimen model. The resulting product will serve as a model of how data in biodiversity collections might be used to address emerging diseases of zoonotic origin. Results from the project will be disseminated widely in opensource journals, at scientific meetings, and via websites associated with the participating organizations and institutions. Support of this project provides a quality resource optimized to inform research relevant to improving our understanding of the biology and spread of SARS-CoV-2. 
The overall objectives are to deliver versioned data products, in formats used by the wider research and biodiversity collections communities, through an open-access repository; project protocols and code via GitHub and described in a peer-reviewed paper, and; sustained engagement with biodiversity collections throughout the project for reintegration of improved data into their local specimen data management systems improving long-term curation.

This RAPID award will produce and deliver a georeferenced, vetted and consolidated data product for horseshoe bats and related species to facilitate understanding of the sources, distribution, and spread of SARS-CoV-2 and related viruses, a timely response to the ongoing global pandemic caused by SARS-CoV-2 and an important contribution to the global effort to consolidate and provide quality data that are relevant to understanding emergent and other properties the current pandemic. This RAPID award is made by the Division of Biological Infrastructure (DBI) using funds from the Coronavirus Aid, Relief, and Economic Security (CARES) Act.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria."

Files included in this resource

  • 9d4b9069-48c4-4212-90d8-4dd6f4b7f2a5.zip: Raw data from iDigBio, DwC-A format
  • 0067804-200613084148143.zip: Raw data from GBIF, DwC-A format
  • 0067806-200613084148143.zip: Raw data from GBIF, DwC-A format
  • 1620226888.zip: Full export of this project's data (enhanced and raw) from BIOSPEX, CSV format
  • bionomia-datasets-attributions.zip: Directory containing 103 Frictionless Data packages for datasets that have attributions made containing Rhinolophids or Hipposiderids, each package also containing a CSV file for mismatches in person date of birth/death and specimen eventDate. File bionomia-datasets-attributions-key_2021-02-25.csv included in this directory provides a key between dataset identifier (how the Frictionless Data package files are named) and dataset name.
  • bionomia-problem-dates-all-datasets_2021-02-25.csv: List of 21 Hipposiderid or Rhinolophid records whose eventDate or dateIdentified mismatches a wikidata recipient’s date of birth or death across all datasets.
  • RAPID-code_collection-date.R: code associated with enhancing collection dates
  • RAPID-code_compile-deduplicate.R: code associated with compiling and deduplicating raw data
  • RAPID-code_standardize-country.R: code associated with standardizing country data
  • RAPID-code_external-linkages-bold.R: code associated with enhancing external linkages
  • RAPID-code_external-linkages-genbank.R: code associated with enhancing external linkages
  • RAPID-code_external-linkages-standardize.R: code associated with enhancing external linkages
  • RAPID-code_people.R: code associated with enhancing data about people
  • rapid-data-providers_2021-05-03.csv: list of data providers and number of records provided to rapid-joined-records_country-cleanup_2020-09-23.csv
  • rapid-final-data-product_2021-05-07.csv: Enhanced dataset, final version from BIOSPEX
  • rapid-joined-records_country-cleanup_2020-09-23.csv: data product initial version where raw data has been compiled and deduplicated, and country data has been standardized
  • RAPID-protocol_collection-date.pdf: protocol associated with enhancing collection dates
  • RAPID-protocol_compile-deduplicate.pdf: protocol associated with compiling and deduplicating raw data
  • RAPID-protocol_external-linkages.pdf: protocol associated with enhancing external linkages
  • RAPID-protocol_georeference.pdf: protocol associated with georeferencing
  • RAPID-protocol_people.pdf: protocol associated with enhancing data about people
  • RAPID-protocol_standardize-country.pdf: protocol associated with standardizing country data
  • RAPID-protocol_taxonomic-names.pdf: protocol associated with enhancing taxonomic name data
  • RAPIDAgentStrings1_archivedCopy_30March2021.ods: resource used in conjunction with RAPID people protocol
  • Rhinolophid-HipposideridAgentStrings_and_People2_archivedCopy_30March2021.ods: resource used in conjunction with RAPID people protocol
  • wikidata-notes-for-bat-collectors_leachman_2020.docx: resource used in conjunction with RAPID people protocol
  • wikidata-notes-for-bat-collectors_leachman_2020.pdf: resource used in conjunction with RAPID people protocol
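
For readers who want to inspect the zipped raw data without unpacking it, the following is a minimal Python sketch, not part of the deposited R code. It assumes a downloaded archive contains a delimited text file of occurrence records. The member name, delimiter, and column name used in the usage note are assumptions and should be checked against the archive's own contents (a Darwin Core Archive normally describes its files in a meta.xml descriptor). The function names are hypothetical.

```python
import csv
import io
import zipfile


def iter_zip_csv_rows(zip_path, member_name, delimiter=","):
    """Yield each row of a delimited text file stored inside a zip archive as a dict."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member_name) as raw:
            # Decode bytes lazily so large occurrence files are not loaded at once.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text, delimiter=delimiter)


def count_by_column(rows, column):
    """Count rows per distinct value of one column; missing or empty values are grouped under ''."""
    counts = {}
    for row in rows:
        key = (row.get(column) or "").strip()
        counts[key] = counts.get(key, 0) + 1
    return counts
```

As a usage example, one might call count_by_column(iter_zip_csv_rows("0067804-200613084148143.zip", "occurrence.txt", delimiter="\t"), "countryCode") to tally records per country code. Here "occurrence.txt", the tab delimiter, and "countryCode" are guesses about the archive layout, not facts stated in this record.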

More information

  • DOI: 10.5281/zenodo.4743102
  • Language: en

Subjects

  • natural history collection, biodiversity collection, specimen, horseshoe bats, coronavirus, COVID-19, biodiversity informatics

Dates

  • Publication date: 2021
  • Issued: May 07, 2021

Notes

Other: Funding by the U.S. National Science Foundation DBI 2033973.

Format

electronic resource

Related items

  • IsVersionOf: https://doi.org/10.5281/zenodo.3974999
  • IsPartOf: https://zenodo.org/communities/covid-19
  • IsPartOf: https://zenodo.org/communities/zenodo