A Semantically Enriched Dataset based on Biomedical NER for the COVID19 Open Research Dataset Challenge (Data Publication)

Title: A Semantically Enriched Dataset based on Biomedical NER for the COVID19 Open Research Dataset Challenge (Data Publication)
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Kroll, H., J. Pirklbauer, J. Ruthmann, and W.-T. Balke
Conference Name: arXiv:2005.08823
Date Published: 05/2020
Abstract

Research into COVID-19 is a major challenge and highly relevant at the moment. New tools are required to assist medical experts in their research with relevant and valuable information. The COVID-19 Open Research Dataset Challenge (CORD-19) is a "call to action" for computer scientists to develop these innovative tools. Many of these applications rely on entity information, i.e., knowing which entities are mentioned within a sentence. For this paper, we have developed a pipeline built on the latest Named Entity Recognition tools for Chemicals, Diseases, Genes and Species. We apply our pipeline to the COVID-19 research challenge and share the resulting entity mentions with the community.

Attachment: ARXIV2020_Entity-Tagged_CoVid19_Dataset.pdf
Size: 374.75 KB