Nonconvex optimization is becoming one of the most powerful workhorses of data science and artificial intelligence. Compared with convex optimization, it enjoys superior statistical accuracy, computational efficiency, and modeling flexibility in numerous modern settings. However, the empirical success of nonconvex optimization largely eludes the reach of the classical statistical and optimization frameworks, which prohibits us from designing more efficient algorithms in a principled manner.
In this talk, I will illustrate how statistical thinking enables us to harness the power of nonconvex optimization. In specific, I will present an algorithmic framework for exploiting the latent geometry induced by the randomness of data. By integrating three new global exploration meta-algorithms — namely, homotopy continuation, tightening after relaxation, and noise regularization — with local search heuristics — such as the variants of gradient descent — this unified framework leads to new nonconvex optimization algorithms for a broad variety of challenging learning problems. In particular, these algorithms enjoy provably optimal statistical accuracy and computational efficiency, and moreover, lead to new scientific discoveries. Time permitting, I will discuss an interesting “more data, less computation” phenomenon, which arises from nonconvex optimization, but generalizes to even more algorithms.
Zhaoran Wang is a graduate student at Princeton University, working at the interface of machine learning, statistics, and optimization. He is the recipient of the AISTATS (Artificial Intelligence and Statistics Conference) notable paper award, ASA (American Statistical Association) best student paper in statistical learning and data mining, INFORMS (Institute for Operations Research and the Management Sciences) best student paper finalist in data mining, and the Microsoft fellowship.