Pushing the Boundaries of Crowd-Enabled Databases with Query-Driven Schema Expansion

Title: Pushing the Boundaries of Crowd-Enabled Databases with Query-Driven Schema Expansion
Publication Type: Conference Paper
Year of Publication: 2012
Authors: Selke, J., C. Lofi, and W.-T. Balke
Conference Name: 38th Int. Conf. on Very Large Data Bases (VLDB)
Date Published: 08/12
Publisher: in PVLDB 5(6)
Conference Location: Istanbul, Turkey

By incorporating human workers into the query execution process, crowd-enabled databases facilitate intelligent, social capabilities like completing missing data at query time or performing cognitive operators. But despite all their flexibility, crowd-enabled databases still maintain rigid schemas. In this paper, we extend crowd-enabled databases by flexible query-driven schema expansion, allowing new attributes to be added to the database at query time. However, the number of crowd-sourced microtasks needed to fill in missing values may often be prohibitively large, and the resulting data quality is doubtful. Instead of simple crowd-sourcing to obtain all values individually, we leverage the user-generated data found in the Social Web: by exploiting user ratings, we build perceptual spaces, i.e., highly compressed representations of opinions, impressions, and perceptions of large numbers of users. Using a few training samples obtained by expert crowd-sourcing, we can then extract all missing data automatically from the perceptual space with high quality and at low cost. Extensive experiments demonstrate that our approach can be applied to boost both performance and quality of crowd-enabled databases, while also providing the flexibility to expand schemas in a query-driven fashion.
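
The pipeline outlined in the abstract (ratings, then a compressed perceptual space, then a few labeled samples, then values for every item) can be illustrated with a minimal sketch. It is not the method of the paper: the truncated SVD used to build the space, the nearest-centroid rule used to extrapolate labels, the toy ratings matrix, and the function names `build_perceptual_space` and `expand_attribute` are all assumptions made for illustration.

```python
import numpy as np

def build_perceptual_space(ratings, n_dims):
    """Embed items in a low-dimensional space (assumed here: truncated SVD).

    ratings: array of shape (n_users, n_items).
    Returns an array of shape (n_items, n_dims).
    """
    # Remove each item's average rating so the space captures taste, not popularity.
    centered = ratings - ratings.mean(axis=0, keepdims=True)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Item coordinates: right singular vectors scaled by their singular values.
    return vt[:n_dims].T * s[:n_dims]

def expand_attribute(space, labeled):
    """Predict a new Boolean attribute for all items from a few labeled ones.

    labeled: dict mapping item index -> bool (e.g. answers from crowd workers).
    Needs at least one positive and one negative example.
    Assumed classifier: nearest class centroid in the perceptual space.
    """
    pos = [i for i, value in labeled.items() if value]
    neg = [i for i, value in labeled.items() if not value]
    c_pos = space[pos].mean(axis=0)
    c_neg = space[neg].mean(axis=0)
    d_pos = np.linalg.norm(space - c_pos, axis=1)
    d_neg = np.linalg.norm(space - c_neg, axis=1)
    return d_pos < d_neg

# Toy data (hypothetical): 6 users rate 8 items on a 1-5 scale.
# Users 0-2 prefer items 0-3, users 3-5 prefer items 4-7.
ratings = np.array([
    [5, 4, 5, 4, 1, 2, 1, 2],
    [4, 5, 4, 5, 2, 1, 2, 1],
    [5, 5, 4, 4, 1, 1, 2, 2],
    [1, 2, 1, 2, 5, 4, 5, 4],
    [2, 1, 2, 1, 4, 5, 4, 5],
    [1, 1, 2, 2, 5, 5, 4, 4],
], dtype=float)

space = build_perceptual_space(ratings, n_dims=2)
# Only two items are labeled; the remaining six are inferred from the space.
predicted = expand_attribute(space, {0: True, 4: False})
print(predicted.tolist())
```

In this sketch only two of the eight items receive a label, which mirrors the idea of replacing one microtask per missing value with a small training set plus automatic extraction.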

267_camera-ready-v3.pdf (1.25 MB)