In this paper, we promote the idea of automatic semantic characterization of scientific claims to explore entity-entity relationships in digital collections. Our approach aims to reduce the time-consuming analysis of query results when the information need is not a single document but an overview of a set of documents. Using this semantic characterization, we propose to identify what we call "dominant" claims, relying on two core properties: the consensual support of a claim in light of the collection's previous knowledge, and the assertiveness of the language the authors use when expressing it. We discuss features that efficiently capture these two properties and formalize the notion of "dominant" claims using Pareto dominance. A practical evaluation on a real-world document collection from the medical domain demonstrates the quality of the results and the potential of our approach.
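To make the notion of Pareto dominance over the two core properties concrete, the following minimal sketch may help. It assumes each claim has already been assigned two numeric scores, where higher is better; the names `Claim`, `support`, `assertiveness`, `dominates`, and `dominant_claims`, as well as the example values, are hypothetical illustrations and not the features or implementation used in the paper.

```python
from typing import NamedTuple


class Claim(NamedTuple):
    text: str
    support: float        # hypothetical consensual-support score (higher is better)
    assertiveness: float  # hypothetical assertiveness score (higher is better)


def dominates(a: Claim, b: Claim) -> bool:
    """Return True if a is at least as good as b on both scores
    and strictly better on at least one (Pareto dominance)."""
    at_least_as_good = a.support >= b.support and a.assertiveness >= b.assertiveness
    strictly_better = a.support > b.support or a.assertiveness > b.assertiveness
    return at_least_as_good and strictly_better


def dominant_claims(claims: list[Claim]) -> list[Claim]:
    """Return the claims that no other claim Pareto-dominates."""
    return [c for c in claims if not any(dominates(o, c) for o in claims)]


# Placeholder claims with made-up scores, for illustration only.
claims = [
    Claim("A treats B", 0.9, 0.4),
    Claim("A may treat B", 0.5, 0.2),
    Claim("A causes C", 0.3, 0.8),
]
front = dominant_claims(claims)
```

Under these assumed scores, "A may treat B" is dominated by "A treats B" on both properties, while the other two claims remain because neither is better than the other on both scores at once.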