Abstract
Active learning is an iterative supervised learning task in which the learning algorithm can actively query an oracle, i.e., a human annotator who understands the nature of the problem, to obtain the ground truth. The motivation behind this approach is to let the learner interactively choose the data it will learn from, which can lead to significantly lower annotation cost, faster training and improved performance. Active learning is appropriate for machine learning applications where labeled data is costly to obtain but unlabeled data is abundant. Most importantly, it permits a learning model to evolve and adapt to new data, unlike conventional supervised learning. Although active learning has been widely considered for single-label learning, applications to multi-label learning have been more limited. In this work, we present a general framework for applying active learning to multi-label data, discussing the key issues that need to be considered in pool-based multi-label active learning and how existing solutions in the literature deal with each of these issues. We further propose a novel aggregation method for evaluating which instances are to be annotated. Extensive experiments on 13 multi-label data sets with different characteristics and under two different application settings (transductive, inductive) show a consistent advantage of our proposed approach over the other approaches and, most importantly, over passive supervised learning. The experiments also reveal interesting aspects related mainly to the properties of the data sets, and secondarily to the application settings.
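To make the selection step in pool-based multi-label active learning more concrete, here is a minimal sketch of how per-label scores might be aggregated into one score per unlabeled instance. It is not the aggregation method proposed in the paper, which the abstract does not describe. The function names (`label_uncertainty`, `aggregate`, `select_queries`), the uncertainty measure, the mean and max aggregations, and the probability values are all illustrative assumptions.

```python
# Illustrative sketch only: generic uncertainty sampling with mean/max
# aggregation across labels, NOT the aggregation method from the paper.

def label_uncertainty(p):
    """Uncertainty of one label prediction: 1.0 at p = 0.5, 0.0 at p = 0 or 1."""
    return 1.0 - abs(2.0 * p - 1.0)

def aggregate(scores, how="mean"):
    """Collapse the per-label scores of one instance into a single score."""
    if how == "mean":
        return sum(scores) / len(scores)
    if how == "max":
        return max(scores)
    raise ValueError(f"unknown aggregation: {how}")

def select_queries(pool_probs, batch_size, how="mean"):
    """Return indices of the pool instances with the highest aggregated uncertainty."""
    scored = [
        aggregate([label_uncertainty(p) for p in row], how)
        for row in pool_probs
    ]
    # sorted() is stable, so ties keep their original pool order.
    ranked = sorted(range(len(scored)), key=lambda i: -scored[i])
    return ranked[:batch_size]

# Hypothetical predicted probabilities: 4 unlabeled instances, 3 labels each.
pool_probs = [
    [0.95, 0.05, 0.90],
    [0.50, 0.55, 0.45],
    [0.99, 0.50, 0.01],
    [0.70, 0.30, 0.80],
]

chosen_mean = select_queries(pool_probs, batch_size=2, how="mean")
chosen_max = select_queries(pool_probs, batch_size=2, how="max")
```

In a full loop, the selected instances would be sent to the oracle for annotation, moved from the pool to the training set, and the model retrained before the next round. The choice of aggregation matters: with these made-up numbers, the mean favors instances that are uncertain on several labels, while the max also picks an instance that is uncertain on only one label.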
http://ift.tt/2gpRRGY