Unified Theory of Efficient Optimization and Estimation

Periodic Reporting for period 3 - UTOPEST (Unified Theory of Efficient Optimization and Estimation)

Reporting period: 2022-03-01 to 2023-08-31

Optimization and estimation are fundamental classes of computational problems at the heart of some of the most prominent current computing applications, e.g. machine learning and data science.
Unfortunately, only for a few, severely restricted cases do we understand what kinds of guarantees efficient algorithms are able to achieve.
In most other cases, there is a significant gap between the known lower and upper bounds on the guarantees achievable by efficient algorithms.
These cases leave open two possibilities: new kinds of efficient algorithms with significantly stronger guarantees than current ones, or new kinds of inherent limitations on the guarantees that efficient algorithms can achieve.
This project aims to make progress toward a unified algorithmic theory that, for a wide range of optimization and estimation problems, provides matching lower and upper bounds on the guarantees of efficient algorithms.

Our starting point for such a theory is the sum-of-squares method.
For any suitably encoded optimization or estimation problem, this method provides a sequence of efficient algorithms based on semidefinite programming.
In many cases, these algorithms significantly generalize the best previous ones and in this way open up the possibility of achieving substantially stronger guarantees.
Indeed, prior work by the PI and collaborators brought this possibility to fruition for basic clustering and robust moment estimation problems.
At the same time, the sum-of-squares method also allows us to reason about new kinds of inherent limitations of efficient algorithms.
The reason is that it turns out to provide a dual perspective on algorithms in the form of the so-called sum-of-squares proof system.
This proof system strikes a delicate balance: it is powerful enough to capture large classes of potential algorithms while still being amenable to impossibility results that demonstrate inherent limitations of such algorithms.
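
As a concrete illustration (not taken from the project's publications), the lowest level of this hierarchy for a quadratic optimization problem such as Max-Cut is the classical semidefinite relaxation. The sketch below, assuming the cvxpy library is available, sets up this degree-2 relaxation for a small toy graph; its optimal value is an efficiently computable certificate of an upper bound on the true optimum.

```python
# Minimal sketch: degree-2 sum-of-squares (SDP) relaxation of Max-Cut.
# Assumes cvxpy is installed; the graph is a hypothetical toy example.
import cvxpy as cp
import numpy as np

# Adjacency matrix of a 4-cycle.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
n = A.shape[0]
L = np.diag(A.sum(axis=1)) - A   # graph Laplacian; max cut = max (1/4) x'Lx over x in {-1,1}^n

# Degree-2 relaxation: replace the rank-one matrix xx' by any PSD matrix X
# with unit diagonal (a "pseudo-moment" matrix of degree 2).
X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0, cp.diag(X) == 1]
objective = cp.Maximize(cp.trace(L @ X) / 4)

prob = cp.Problem(objective, constraints)
prob.solve()
print("SDP upper bound on the maximum cut:", prob.value)
```

Higher levels of the hierarchy optimize over larger pseudo-moment matrices subject to more constraints, giving a sequence of increasingly strong (and increasingly expensive) relaxations.
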
During this period, the work on this project has focused on the issue of robustness.
Here, we desire the guarantees of our algorithms to hold not only within the narrow confines of a typical statistical model but even in the face of adversarial changes to the data.
Many state-of-the-art algorithms, both in theory and in practice, turn out to be fragile even against severely restricted kinds of adversaries.
This observation raises the question of whether we can design better algorithms with more robust guarantees or whether there is an inherent price of robustness.

In a work published at FOCS 2020, we investigate sparse principal component analysis from the point of view of robustness.
We demonstrate that the previous best algorithms for a large parameter regime are fragile in the face of small adversarial perturbations.
Our work shows that sum-of-squares techniques can overcome many of these limitations and can, in the presence of such perturbations, achieve guarantees that are conjectured to be close to optimal even if there were no perturbations.
What underlies our result is a general property of algorithms based on the sum-of-squares method:
Unlike many other algorithms, sum-of-squares can certify bounds on optimization problems associated with the estimation task at hand.
These kinds of certificates often automatically imply certain robustness properties for the corresponding algorithms.
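
To make the notion of certification concrete, the sketch below shows the classical basic SDP relaxation of the sparse eigenvalue problem (again assuming cvxpy; this illustrates the certification idea and is not the higher-degree sum-of-squares algorithm from the FOCS 2020 paper). Its optimal value is an efficiently verifiable upper bound on the variance explained by any k-sparse unit direction.

```python
# Minimal sketch: certifying an upper bound on the k-sparse eigenvalue
#   max { v' M v : ||v||_2 = 1, v has at most k nonzero entries }
# via a basic SDP relaxation.  The matrix M and parameters are hypothetical.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
d, k = 20, 3
B = rng.normal(size=(d, d))
M = B @ B.T / d                          # a covariance-like matrix

# If v is a k-sparse unit vector, then X = vv' is PSD, has trace 1,
# and the sum of absolute values of its entries is at most k.
X = cp.Variable((d, d), symmetric=True)
constraints = [X >> 0,
               cp.trace(X) == 1,
               cp.sum(cp.abs(X)) <= k]
prob = cp.Problem(cp.Maximize(cp.trace(M @ X)), constraints)
prob.solve()
print("certified upper bound on the k-sparse eigenvalue:", prob.value)
```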

In a work published at FOCS 2021, we investigate the issue of robustness for the problem of detecting communities in stochastic block-model graphs.
For the most basic case of two communities, we know efficient algorithms for this problem that work all the way up to the information-theoretic threshold.
Again, it turns out that these algorithms are fragile and do not achieve any substantial guarantees even if only a tiny fraction of the edges of the input graph is adversarially altered.
Interestingly, other kinds of algorithms can tolerate these corruptions, but they appear to work only up to a point bounded away from the aforementioned threshold.
Indeed, some researchers suspected that corruptions of this kind might inherently alter the threshold at which it is at all possible to approximately recover the hidden communities.
Our work proves this suspicion wrong and develops a new kind of algorithm that works all the way up to the threshold even in the presence of a constant fraction of adversarial edge changes.
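
For readers unfamiliar with the model, the sketch below (plain numpy; the parameters are illustrative and chosen well above the detection threshold) samples a two-community stochastic block model and recovers the communities with a naive spectral method. This is the kind of average-case estimator that a small number of adversarial edge changes can break; it is not the robust, close-to-threshold algorithm from the FOCS 2021 paper.

```python
# Minimal sketch: two-community stochastic block model plus a naive
# spectral estimator.  Parameters are hypothetical and lie well above
# the detection threshold (a - b)^2 > 2(a + b).
import numpy as np

def sample_sbm(n, a, b, rng):
    """Two balanced communities; edge probability a/n within, b/n across."""
    labels = rng.choice([-1, 1], size=n)
    same = np.equal.outer(labels, labels)
    probs = np.where(same, a / n, b / n)
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(float), labels

rng = np.random.default_rng(0)
n, a, b = 2000, 30.0, 10.0               # (a - b)^2 = 400 > 2(a + b) = 80
A, labels = sample_sbm(n, a, b, rng)

# Naive estimator: the sign pattern of the eigenvector belonging to the
# second-largest eigenvalue of the adjacency matrix.  It succeeds in this
# average-case setting, but adversarially rewiring even a small fraction
# of the edges can destroy its guarantees.
eigvals, eigvecs = np.linalg.eigh(A)
guess = np.sign(eigvecs[:, -2])
overlap = abs(np.mean(guess * labels))
print(f"overlap with the hidden communities: {overlap:.2f}")
```
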
One of the goals for the remaining period of the project is to obtain a more unified perspective on how to algorithmically deal with the issue of robustness.
The two works described in the previous section of the summary address this issue in fundamentally different ways.
While our first work (on robust algorithms for sparse principal component analysis) fits well into the kind of unified theory proposed in the first part of this summary, our second work (on robust recovery for stochastic block models close to the threshold) proposes an algorithm that doesn't appear to be easily captured by the kind of unified theory described before.
(The technical reason is that the underlying optimization problem is not a relaxation of the maximum-likelihood objective.)
Hence, a significant goal for the remaining period of the project is to either identify a different robust algorithm for stochastic block models that is captured by our unified theory or, failing that, to broaden our notion of a unified theory so that it captures the existing algorithm for this problem in a natural way.

Finally, we have obtained some preliminary results for the problem of learning non-spherical Gaussian mixtures with many components, where we improve the dependency on the number of components from exponential in previous algorithms to quasi-polynomial.
By the end of the project, we expect to be able to provide a more complete picture of the complexity landscape of learning non-spherical Gaussian mixture models.
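
For concreteness, the sketch below (plain numpy, with hypothetical parameters) samples from the statistical model in question: a mixture of k Gaussians whose covariance matrices are arbitrary rather than multiples of the identity. The learning task is to recover the means, covariances, and mixing weights from the samples alone, ideally with only quasi-polynomial dependence on k.

```python
# Minimal sketch: sampling from a non-spherical Gaussian mixture model.
# All parameters are hypothetical; this illustrates the input distribution,
# not the learning algorithm itself.
import numpy as np

rng = np.random.default_rng(1)
d, k, n = 3, 4, 5000                     # dimension, components, samples

means = rng.normal(scale=5.0, size=(k, d))
covs = []
for _ in range(k):
    B = rng.normal(size=(d, d))
    covs.append(B @ B.T + 0.1 * np.eye(d))   # general (non-spherical) covariance
weights = np.full(k, 1.0 / k)

# Draw each sample by picking a component and then sampling from its Gaussian.
z = rng.choice(k, size=n, p=weights)
X = np.vstack([rng.multivariate_normal(means[j], covs[j],
                                       size=int(np.sum(z == j)))
               for j in range(k)])

# Learning task: given only X, estimate all k means, covariances, and
# weights up to small error.
print(X.shape)                           # (5000, 3)
```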