Briggs, Forrest; Fern, Xiaoli Z.; Raich, Raviv; Lou, Qi. 2013. Instance Annotation for Multi-Instance Multi-Label Learning. ACM Transactions on Knowledge Discovery from Data 7(3), 30 pp. doi:10.1145/2500491
Multi-instance multi-label learning (MIML) is a framework for supervised classification where the objects to be classified are bags of instances associated with multiple labels. For example, an image can be represented as a bag of segments and associated with a list of objects it contains. Prior work on MIML has focused on predicting label sets for previously unseen bags. We instead consider the problem of predicting instance labels while learning from data labeled only at the bag level. We propose a regularized rank-loss objective designed for instance annotation, which can be instantiated with different aggregation models connecting instance-level labels with bag-level label sets. The aggregation models that we consider can be factored as a linear function of a “support instance” for each class, which is a single feature vector representing a whole bag. Hence we name our proposed methods rank-loss Support Instance Machines (SIM). We propose two optimization methods for the rank-loss objective, which is nonconvex. One is a heuristic method that alternates between updating support instances and solving a convex problem in which the support instances are treated as constant. The other is to apply the constrained concave-convex procedure (CCCP), which can also be interpreted as iteratively updating support instances and solving a convex problem. To solve the convex problem, we employ the Pegasos framework of primal subgradient descent, and prove that it finds an ε-suboptimal solution in runtime that is linear in the number of bags, instances, and 1/ε. Additionally, we suggest a method of extending the linear learning algorithm to nonlinear classification without increasing the runtime asymptotically. Experiments on artificial and real-world datasets, including images and audio, show that the proposed methods achieve higher accuracy than other loss functions used in prior work, e.g., Hamming loss, and than recent work in ambiguous label classification.
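For readers unfamiliar with the Pegasos framework the abstract invokes, the following is a minimal sketch of the standard Pegasos primal stochastic subgradient update for a plain binary hinge-loss objective. It is not the paper's rank-loss SIM objective (which operates on bags and support instances); it only illustrates the per-step update and the linear-in-1/ε runtime flavor. All names and parameter choices here are illustrative assumptions.

```python
import numpy as np

def pegasos(X, y, lam=0.01, T=2000, seed=0):
    """Pegasos-style primal subgradient descent for the regularized hinge loss
        (lam/2)||w||^2 + (1/n) * sum_i max(0, 1 - y_i * w.x_i).
    Each step costs O(d), and T = O(1/(lam * eps)) steps suffice for an
    eps-suboptimal solution, matching the linear-in-1/eps runtime noted above.
    NOTE: a sketch of vanilla binary Pegasos, not the paper's rank-loss SIM.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, T + 1):
        i = rng.integers(n)           # sample one training example
        eta = 1.0 / (lam * t)         # decreasing step size 1/(lam*t)
        if y[i] * (w @ X[i]) < 1:     # hinge active: data term enters subgradient
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:                         # hinge inactive: only the regularizer shrinks w
            w = (1 - eta * lam) * w
    return w
```

In the paper's setting, the convex subproblem solved at each outer (support-instance-updating) iteration would replace the per-example hinge term with the rank-loss over bag-level label sets, but the shrink-then-step structure of the update is the same.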