Conferences: Robust Clustering of Data Collected Via Crowdsourcing
A. Pagès Zamora, Giannakis, López Valcarce and Giménez Febrer

Abstract

Crowdsourcing approaches rely on a collection of multiple individuals to solve problems that require the analysis of large data sets in a timely and accurate manner. The inexperience of participants, or annotators, motivates robust techniques. Focusing on clustering setups, the data provided by all annotators is modeled here as a mixture of Gaussian components plus a uniformly distributed random variable that captures outliers. The proposed algorithm is based on the expectation-maximization algorithm and jointly provides soft assignments of data to clusters, rates annotators according to their performance, and estimates the number of Gaussian components in the non-Gaussian/Gaussian mixture model.
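
As an illustration of the kind of model the abstract describes, the following is a minimal sketch of expectation-maximization for a one-dimensional mixture of Gaussian components plus one uniform component that absorbs outliers. It is not the authors' algorithm: it does not rate annotators or estimate the number of components, and the function name `em_gauss_uniform`, the parameters `k`, `lo`, `hi`, the quantile-based initialization, and the variance floor are all assumptions made for this example.

```python
import numpy as np

def em_gauss_uniform(x, k, lo, hi, n_iter=200):
    """EM for k 1-D Gaussians plus a uniform outlier component on [lo, hi].

    Hypothetical helper for illustration; assumes every sample lies in [lo, hi].
    Returns (weights, means, variances, responsibilities); the last weight and
    the last responsibility column belong to the uniform (outlier) component.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Assumed initialization: evenly spaced quantiles for the means,
    # the overall sample variance for every component, equal weights.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, np.var(x))
    w = np.full(k + 1, 1.0 / (k + 1))
    u = 1.0 / (hi - lo)  # constant density of the uniform component

    for _ in range(n_iter):
        # E-step: soft assignment of each sample to each component.
        diff = x[:, None] - mu[None, :]
        dens = np.exp(-0.5 * diff**2 / var) / np.sqrt(2.0 * np.pi * var)
        joint = np.hstack([w[:k] * dens, np.full((n, 1), w[k] * u)])
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-300)

        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0)
        w = nk / n
        g = resp[:, :k]
        nk_g = np.maximum(nk[:k], 1e-12)
        mu = (g * x[:, None]).sum(axis=0) / nk_g
        var = (g * (x[:, None] - mu[None, :])**2).sum(axis=0) / nk_g
        var = np.maximum(var, 1e-6)  # floor to avoid collapsing components

    return w, mu, var, resp
```

The rows of the returned responsibility matrix are the soft assignments: each row sums to one, and a large value in the last column marks a sample that the model attributes to the outlier component rather than to any cluster.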






©UPC Universitat Politècnica de Catalunya
Signal Processing and Communications group