Crowdsourcing has emerged as a promising paradigm for annotating, structuring, and managing Web data. Still, as long as the problem of crowd workers' trustworthiness in terms of result quality remains fundamentally unsolved, all these efforts remain doubtful. Therefore, in this paper we examine today's dominant quality assurance techniques and investigate how they cope with Web data, which typically follows long-tail distributions that make it easy for strategic spammers to guess the prevalent answers and thus go undetected. We provide a thorough theoretical analysis, using test theory to quantify the success of different methods on such skewed domains and to expose their individual weaknesses. Building on our case study analysis, we propose a simple, privacy-preserving, task-agnostic model that improves test reliability while actually decreasing the overhead costs of quality assurance. Finally, we show in controlled crowdsourcing experiments that our method remains stable even for higher numbers of spammers.