BAFREC: Balancing Frequency and Rarity for Entity Characterization in Open Linked Data

Title: BAFREC: Balancing Frequency and Rarity for Entity Characterization in Open Linked Data
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Kroll, H., D. Nagel, and W.-T. Balke
Conference Name: Proceedings of the ACM CIKM 2018 Workshops
Date Published: 10/2018
Conference Location: Turin, Italy
Abstract

Today’s growth of linked open data (LOD) sources calls for summarization systems that help users navigate large volumes of data. A major task is entity summarization, where the most meaningful subset of all available information about an entity has to be selected. In particular, the selected information has to characterize each entity with high precision. These summaries can then be used for a wide variety of applications such as initial presentation (so-called info boxes), entity disambiguation, or entity reconciliation. This paper introduces BAFREC, a novel entity summarization method that balances frequency and rarity metrics across all entity properties. In contrast to simply choosing the most popular or most frequent concepts, we design a new strategy: BAFREC first splits all facts about an entity into categories and then rates each category using a specifically tailored metric. For instance, some facts, such as type information, are preferred with respect to their rarity, i.e., picking the most specialized concept, while others may be rated according to their general popularity. The evaluation against the ESBM benchmark shows that, especially for computing short summaries, BAFREC outperforms commonly applied approaches.
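
The strategy described in the abstract (split an entity's facts into categories, then rank each category with its own metric) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two categories (type facts versus all other facts), the use of dataset-wide object counts as both the rarity and the popularity signal, the round-robin selection across categories, and the names `summarize`, `RDF_TYPE`, `entity_facts`, and `all_facts` are all assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical predicate name used to mark type facts in this sketch.
RDF_TYPE = "rdf:type"


def summarize(entity_facts, all_facts, k):
    """Pick up to k (predicate, object) facts for one entity.

    Assumption: type facts are ranked by rarity (least frequent object
    first) and all other facts by popularity (most frequent object first);
    the summary alternates between the two ranked lists.
    """
    # How often each object occurs across the whole dataset.
    object_freq = Counter(obj for _, obj in all_facts)

    # Step 1: split the entity's facts into categories.
    type_facts = [f for f in entity_facts if f[0] == RDF_TYPE]
    other_facts = [f for f in entity_facts if f[0] != RDF_TYPE]

    # Step 2: rate each category with its own metric.
    type_facts.sort(key=lambda f: object_freq[f[1]])    # rarest first
    other_facts.sort(key=lambda f: -object_freq[f[1]])  # most popular first

    # Step 3: fill the summary by alternating between the categories.
    queues = [type_facts, other_facts]
    summary = []
    turn = 0
    while len(summary) < k and any(queues):
        queue = queues[turn % 2]
        if queue:
            summary.append(queue.pop(0))
        turn += 1
    return summary


# Toy dataset: "Agent" is a very common type, "Physicist" a rare one.
all_facts = (
    [(RDF_TYPE, "Agent")] * 5
    + [(RDF_TYPE, "Person")] * 3
    + [(RDF_TYPE, "Physicist")]
    + [("birthPlace", "Ulm")] * 2
    + [("field", "Physics")] * 4
)
entity_facts = [
    (RDF_TYPE, "Agent"),
    (RDF_TYPE, "Physicist"),
    ("birthPlace", "Ulm"),
    ("field", "Physics"),
]
short_summary = summarize(entity_facts, all_facts, 2)
```

With the toy data above, a summary of size 2 keeps the most specialized type (`Physicist`) and the most popular non-type fact (`field`, `Physics`), which mirrors the rarity-versus-popularity contrast the abstract describes. The actual categories and metrics used by BAFREC are defined in the paper itself.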