
Sublinear Optimization for Machine Learning

Final Report Summary - SUBLINEAROPTML (Sublinear Optimization for Machine Learning)

This final report pertains to the project "Sublinear Optimization for Machine Learning", carried out at the Technion, PI: Elad Hazan, and generously funded through the Marie Curie program.
Can we solve optimization problems using computational resources proportional to the information necessary to represent and verify the solution? The interplay between information and computation is at the core of the hardest problems of computer science and mathematics. Statistical problems arising in machine learning, however, exhibit a much more attainable version of this question. In the past few years we have been able to design algorithms that run in time proportional to the information-theoretic limit of verifying a solution. These algorithms run in less time than is needed to perform even a single linear scan of the input, and are therefore called sublinear optimization algorithms.
The main investigation topic of our project is to develop provably correct algorithms which run in sublinear time and/or space, i.e. they do not observe all the data even once, and are hence applicable to super-scale problems.
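To make the idea concrete, here is a minimal sketch of a sampling primitive in the spirit of this line of work: estimating an inner product by l2-sampling a few coordinates, so that only a sublinear number of entries of one input vector is ever read. The function name sample_inner_product and all parameter choices are illustrative assumptions, not the project's actual algorithm.

```python
import numpy as np

def sample_inner_product(w, x, num_samples=100, rng=None):
    """Estimate <w, x> by l2-sampling coordinates of w (illustrative sketch).

    Coordinate i is drawn with probability p_i = w[i]^2 / ||w||^2, and the
    unbiased single-sample estimator w[i]*x[i] / p_i is averaged over
    num_samples draws -- touching only num_samples entries of x.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = w**2 / np.dot(w, w)               # l2 sampling distribution over coordinates
    idx = rng.choice(len(w), size=num_samples, p=p)
    estimates = w[idx] * x[idx] / p[idx]  # each term has expectation <w, x>
    return estimates.mean()

# Usage: with num_samples << d, only a sublinear fraction of x is read.
d = 1_000_000
rng = np.random.default_rng(0)
w, x = rng.standard_normal(d), rng.standard_normal(d)
print(sample_inner_product(w, x, num_samples=2000, rng=rng), np.dot(w, x))
```

Averaging more samples trades running time for estimation accuracy, which is the basic dial in sublinear algorithms of this kind.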

Specific objectives listed in the project proposal are:
1. Develop a sublinear-time solver for the SVM (support vector machine) optimization problem. Our solver will depend sublinearly on the two main complexity parameters of the corresponding optimization problem: the number of examples and the dimension. Current solvers have at least linear dependence on both of these parameters. Two variants of this tool were considered: soft-margin SVM and the "Lasso".
2. Enable efficient use of kernels: develop sublinear-time algorithms for classification using kernels, i.e. mappings to a high-dimensional space whose structure enables efficient inner-product computation. The kernel method allows one to compute non-linear classifiers such as polynomial and Gaussian classifiers (see the sketch after this list).
3. Develop sublinear algorithms for PCA (principal component analysis) and matrix factorization. These matrix operations are the basic building blocks of collaborative filtering and other methods for recommendation systems.
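As referenced in objective 2 above, kernels such as the polynomial and Gaussian kernels are functions of inner products alone, which is what allows inner-product estimators to extend to them. The sketch below is a plain illustration of this reduction; the function names and default parameters (degree, c, gamma) are assumptions chosen for illustration.

```python
import numpy as np

def polynomial_kernel(x, y, degree=3, c=1.0):
    """Polynomial kernel: (<x, y> + c)^degree."""
    return (np.dot(x, y) + c) ** degree

def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2).

    Expanding ||x - y||^2 = <x,x> - 2<x,y> + <y,y> shows the kernel is
    itself a function of inner products, so any estimator of <x, y>
    (such as the sampling primitive sketched earlier) extends to it.
    """
    sq_dist = np.dot(x, x) - 2.0 * np.dot(x, y) + np.dot(y, y)
    return np.exp(-gamma * sq_dist)
```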
All three of these investigation areas were researched during the project. In particular, we attained the following results, all of which were published in broad-audience, top-tier machine learning venues:
1. The first sublinear soft-margin SVM algorithm was designed, implemented and benchmarked. A publication describing this development was accepted and published in the proceedings of NIPS (a major annual machine learning conference) under the title "Beating SGD: Learning SVMs in Sublinear Time".
2. The above paper discusses applications to kernels, along the lines of the original sublinear algorithms for machine learning; the application to various kernels, including the Gaussian and polynomial kernels, is developed in full detail in the paper "Sublinear optimization for machine learning", JACM 2012.
3. The application of sublinear-time techniques to the operator and matrix setting was explored in the context of semi-definite programming. In particular, we developed the first sublinear-time algorithm for semi-definite programming. A publication describing this result was accepted and published in the proceedings of NIPS 2011, titled "Approximating Semidefinite Programs in Sublinear Time".
4. We extended the sublinear-time framework to optimization and learning with partial access to information. A publication detailing near-optimal algorithms for learning with partially observed attributes was presented at ICML 2012, titled "Linear Regression with Limited Observation"; the underlying estimator idea is sketched below.
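One way to sketch the limited-observation idea: split the per-example attribute budget into two independent uniform samples, one to estimate the residual and one to estimate the example itself, yielding an unbiased gradient estimate that reads only k of the d attributes. This is a rough sketch under simplifying assumptions (the helper limited_obs_gradient is hypothetical); the ICML 2012 paper analyzes more refined estimators.

```python
import numpy as np

def limited_obs_gradient(w, x, y, k, rng):
    """Unbiased gradient estimate for the squared loss 0.5*(<w,x> - y)^2
    that observes only k attributes of the example x (illustrative sketch).

    The budget is split into two independent uniform samples: one
    estimates the residual <w, x> - y, the other estimates the vector x.
    Because the two estimates are independent, their product is an
    unbiased estimate of the true gradient (<w, x> - y) * x.
    """
    d = len(x)
    half = k // 2                                    # assumes k >= 2
    i1 = rng.choice(d, size=half, replace=False)     # sample for the residual
    i2 = rng.choice(d, size=k - half, replace=False) # independent sample for x
    resid_hat = (d / half) * np.dot(w[i1], x[i1]) - y
    x_hat = np.zeros(d)
    x_hat[i2] = (d / (k - half)) * x[i2]             # unbiased estimate of x
    return resid_hat * x_hat
```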
Another area of research explored in this project is projection-free optimization and learning. The computational bottleneck in applying state-of-the-art iterative methods to machine learning and optimization is often the so-called "projection step". As part of this research project we designed projection-free optimization algorithms that replace projections with more efficient linear optimization steps. Specific results include a projection-free algorithm for online learning, presented and published at the 29th International Conference on Machine Learning (ICML 2012), and the first linearly convergent projection-free algorithm, presented and published at the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2013).
Eliminating projections and replacing them with linear optimization steps was shown to give faster algorithms for matrix completion, a predominant technology in recommendation systems.
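To illustrate the projection-free scheme in the matrix completion setting, the sketch below runs conditional-gradient (Frank-Wolfe) steps over the nuclear-norm ball, where the linear optimization oracle amounts to computing a top singular vector pair of the gradient. This is a minimal sketch, not the algorithm analyzed in the publications above; frank_wolfe_completion and its parameters are illustrative, and a full SVD is used for brevity where a single top singular pair would suffice.

```python
import numpy as np

def frank_wolfe_completion(M_obs, mask, tau, T=100):
    """Projection-free (Frank-Wolfe) sketch for matrix completion over
    the nuclear-norm ball {X : ||X||_* <= tau}.

    Each iteration replaces a costly projection onto the ball with a
    linear optimization step: minimizing <grad, S> over the ball, whose
    solution is the rank-one matrix -tau * u v^T built from the top
    singular pair of the gradient.
    """
    X = np.zeros_like(M_obs)                   # zero matrix lies in the ball
    for t in range(T):
        G = mask * (X - M_obs)                 # gradient of 0.5*||mask*(X - M)||_F^2
        u, s, vt = np.linalg.svd(G)            # only the top pair is needed
        S = -tau * np.outer(u[:, 0], vt[0])    # linear-optimization oracle answer
        eta = 2.0 / (t + 2)                    # standard Frank-Wolfe step size
        X = (1 - eta) * X + eta * S
    return X
```

Note that each iteration adds a rank-one matrix, so the iterates stay low-rank throughout, which is part of what makes projection-free methods attractive for large-scale matrix completion.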