Measuring the Semantic World – How to Map Meaning to High-Dimensional Entity Clusters in PubMed?

Title: Measuring the Semantic World – How to Map Meaning to High-Dimensional Entity Clusters in PubMed?
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Wawrzinek, J., and W.-T. Balke
Conference Name: The 20th International Conference on Asia-Pacific Digital Libraries (ICADL)
Date Published: 11/2018
Conference Location: Hamilton, New Zealand

The exponential increase of scientific publications in the medical field urgently calls for innovative access paths beyond the limits of term-based search. As an example, the search term “diabetes” returns over 600,000 publications in the medical digital library PubMed. In such cases, the automatic extraction of semantic relations between important entities like active substances, diseases, and genes can help to reveal entity relationships and thus allow simplified access to the knowledge embedded in digital libraries. At the same time, for semantic-relation tasks, distributional embedding models based on neural networks promise considerable progress in terms of accuracy, performance, and scalability. Yet, despite the recent successes of neural networks in this field, questions arise related to their non-deterministic nature: Are the extracted semantic relations meaningful, and do they perhaps even reveal new and unknown entity relationships? In this paper, we address this question by measuring the associations between important pharmaceutical entities such as active substances (drugs) and diseases in high-dimensional embedded space. In our investigation, we show that while only a few of the contextualized associations directly correlate with spatial distance, these associations have potential for predicting new associations, which makes the method suitable as a new, literature-based technique for important practical tasks such as drug repurposing.
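As a rough illustration of what measuring drug-disease associations in an embedded space can look like, the following minimal sketch ranks disease vectors by their closeness to a drug vector. It is not the authors' implementation: the abstract only speaks of "spatial distance", so the use of cosine similarity is an assumption, and the entity names (`drug_a`, `disease_x`, ...), the three-dimensional toy vectors, and the helper `rank_diseases` are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_diseases(drug, drug_vectors, disease_vectors, top_k=3):
    """Return the top_k (disease, similarity) pairs closest to the given drug.

    Hypothetical helper: similarity is assumed to be cosine similarity.
    """
    query = drug_vectors[drug]
    scored = [
        (name, cosine_similarity(query, vec))
        for name, vec in disease_vectors.items()
    ]
    # Highest similarity first, i.e. the nearest neighbours in the embedding.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors invented for illustration only; real embeddings would be
# learned from the literature and have hundreds of dimensions.
drug_vectors = {
    "drug_a": [0.9, 0.1, 0.2],
    "drug_b": [0.1, 0.8, 0.3],
}
disease_vectors = {
    "disease_x": [0.8, 0.2, 0.1],
    "disease_y": [0.2, 0.9, 0.2],
    "disease_z": [0.1, 0.2, 0.9],
}

if __name__ == "__main__":
    for name, score in rank_diseases("drug_a", drug_vectors, disease_vectors):
        print(f"{name}: {score:.3f}")
```

In such a setup, a disease that ranks close to a drug without a known association between the two would be a candidate for further inspection, which is the kind of prediction the abstract relates to drug repurposing.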
