Towards Semantic Quality Enhancement of User Generated Content

Title: Towards Semantic Quality Enhancement of User Generated Content
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Pinto, J. M. G., N. Kiehne, and W.-T. Balke
Conference Name: The 20th International Conference on Asia-Pacific Digital Libraries (ICADL)
Date Published: 11/2018
Conference Location: Hamilton, New Zealand

With the increasing amount of user-generated content such as scientific blogs, question-answering archives (Quora or Stack Overflow), and Wikipedia, the challenge of evaluating its quality naturally arises. Previous work has shown the potential to automatically evaluate such content at the syntactic and pragmatic levels, covering attributes such as conciseness, organization, and readability. We push these efforts forward and focus on how to develop an intelligent service that eases users' engagement with two semantic attributes: factual accuracy, i.e., whether facts are correct, and validity, i.e., whether reliable sources support the content. To do so, we deploy a Deep Learning approach to learn citation categories from Wikipedia. We thus introduce an automatic mechanism that can accurately determine which specific citation category is needed, helping users increase the value of their contribution at the semantic level. To that end, we automatically learn linguistic patterns from Wikipedia to support a broad range of fields. We extensively evaluated several machine learning models trained on more than one million annotated sentences resulting from the massive effort of Wikipedia contributors. We compare the performance of the different methods and present an in-depth analysis focusing on the balanced accuracy achieved.
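
The abstract reports its results in terms of balanced accuracy. As a minimal sketch, assuming the common definition of balanced accuracy as the mean of per-class recall, the following shows how it can differ from plain accuracy when some categories are rare. The category names and predictions are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical citation categories and model predictions, for illustration only.
y_true = ["web", "web", "web", "web", "book", "journal"]
y_pred = ["web", "web", "web", "web", "web", "journal"]

# Plain accuracy counts every sentence equally, so frequent classes dominate.
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Balanced accuracy weights every class equally, so the missed "book" class counts fully.
score = balanced_accuracy(y_true, y_pred)

print(f"plain accuracy:    {plain_accuracy:.3f}")
print(f"balanced accuracy: {score:.3f}")
```

In this toy case the classifier never predicts the rare category, which plain accuracy largely hides and balanced accuracy penalizes.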

Camera-Ready of ICADL2018 Paper 40.pdf (505.26 KB)