Context-Compatible Information Fusion for Scientific Knowledge Graphs

Title: Context-Compatible Information Fusion for Scientific Knowledge Graphs
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Kroll, H., J.-C. Kalo, D. Nagel, S. Mennicke, and W.-T. Balke
Conference Name: 24th International Conference on Theory and Practice of Digital Libraries (TPDL)
Date Published: 08/2020
Conference Location: Lyon, France

Augmenting document collections with entity-centric knowledge from knowledge graphs is a clearly visible trend, especially in scientific digital libraries. Entity facts are either manually curated or, for higher scalability, automatically harvested from large volumes of text documents. The often-claimed benefit is that collection-wide fact extraction combines information from huge numbers of documents into a single database. However, even if the extraction process were 100% correct, the promise of pervasive information fusion within retrieval tasks poses serious threats to the validity of the results. This is because important contextual information provided by each document is often lost in the process and cannot be readily restored at retrieval time. In this paper, we quantify the consequences of uncontrolled knowledge graph evolution in real-world scientific libraries using NLM's PubMed corpus vs. the SemMedDB knowledge base. We also operationalise the notion of implicit context as a viable way to establish context compatibility for all extracted facts, based on the pairwise coherence of all documents used for extraction: our derived measures for context compatibility determine which facts are relatively safe to combine, and they allow precision to be balanced against recall. Our practical experiments extensively evaluate context compatibility based on implicit contexts for typical digital library tasks. The results show that our implicit notion of context compatibility is superior to existing methods in terms of both simplicity and retrieval quality.

TPDL2020_Kroll_Camera_Ready.pdf (308.77 KB)