Given a small number of randomly sampled entries of a low-rank matrix, is it possible to reconstruct the entire matrix? Recent work has demonstrated the remarkable fact that such reconstruction is indeed possible, and moreover can be done using convex optimization. Such problems arise in many applications, including partially observed PCA and, notably, collaborative filtering, which is increasingly popular in e-commerce and elsewhere.
A confounding fact, however, is that if even a single column of the matrix is corrupted (e.g., by a single malicious user in collaborative filtering), the output of existing algorithms can be arbitrarily far from the true matrix. Because the matrix is only partially observed, the corrupted columns cannot be identified a priori and separated from the good ones.
This talk considers precisely this problem: matrix completion when some (potentially large) number of columns are arbitrarily, perhaps adversarially, corrupted. We show that with a vanishing fraction of observed entries, it is nevertheless possible to complete the matrix over the uncorrupted columns and to identify the corrupted ones, even when the number of corrupted columns grows. Our algorithm is based on convex optimization and, in particular, can handle large-scale problems. Our results hold without any assumptions on the number, locations, or values of the observed entries of the manipulated columns.
Based on joint work with Yudong Chen, Huan Xu, and Sujay Sanghavi.