Periodic Reporting for period 1 - NeurExCo (Synergizing Neural Network Theory and Combinatorial Optimization via Extension Complexity)
Reporting period: 2024-04-01 to 2026-03-31
In contrast, combinatorial optimization is a well-established discipline at the intersection of mathematics and computer science, dealing with classical algorithmic questions like the Shortest Path or Traveling Salesperson Problems. A powerful tool to study structural and algorithmic properties of combinatorial optimization problems is polyhedral geometry. For example, the geometric notion of extension complexity classifies how well a specific problem can be expressed and solved via an extremely successful general-purpose technique called linear programming.
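To make this notion concrete, we recall the standard definition (stated here only for orientation, not as a new contribution): the extension complexity xc(P) of a polytope P is the smallest number of facets, i.e. defining linear inequalities, of a polytope Q that can be mapped onto P by a linear projection; such a Q is called an extended formulation of P. A small extension complexity therefore means that optimizing a linear function over P can be written as a small linear program, even if P itself has very many facets.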
Recent developments show that polyhedral theory can also be a powerful tool to achieve a better mathematical understanding of neural networks. The overall goal of this project is to significantly intensify the connection between neural networks and polyhedral theory, using the concept of extension complexity. This new symbiosis has the potential to advance both the theoretical understanding of neural networks and the fundamental understanding of classical combinatorial optimization problems. On the side of neural networks, the goal is to obtain new bounds on the size and depth required to solve a given problem, serving as an explanation of why large and deep neural networks are more successful in practice. On the side of combinatorial optimization, generalized notions of extension complexity inspired by neural networks will lead to new structural and algorithmic insights into classical problems like the matching problem.
Subproject A: This subproject is closest to the original proposal and the actual core theme of the project. We investigated to what extent the extension complexity, which has previously been used to prove limits of linear programming, can also be used to prove limits of neural networks and therefore of modern AI. As this seems to be a challenging task for general neural networks, we also investigated monotone neural networks, that is, neural networks which are not allowed to subtract. For monotone neural networks, proving lower bounds via extension complexity appeared more promising. We furthermore aimed to generalize the notion of extension complexity in such a way that, on the one hand, it still has a meaningful interpretation within combinatorial optimization and, on the other hand, it is strong enough to lower-bound general (non-monotone) neural networks.
Subproject B: This second subproject is focused on a more technical and geometric question. In the course of the research related to Subproject A, it became apparent that, mathematically, subtraction makes neural networks very powerful. Without subtraction, a neural network always represents a convex piecewise linear function, while subtraction provides the power to represent non-convex functions. Consequently, in Subproject B, we investigated the relation between convex and non-convex piecewise linear functions. It has long been known that every non-convex piecewise linear function can be represented as a difference of two convex ones, but it remained an open question how much the complexity of the functions involved might increase in doing so.
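A small illustrative example (not taken from the preprint and included here only to convey the phenomenon): the "hat" function h(x) = max(0, 1 - |x|) is piecewise linear but not convex, yet it can be written as the difference h(x) = max(0, 1 + x, 2x) - max(0, 2x) of two convex piecewise linear functions. Both building blocks are convex, in line with the observation above about subtraction-free networks, and it is only the final subtraction that creates the non-convexity.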
Subproject A: The related preprint can be found here: Hertrich, Christoph, and Georg Loho. "Neural Networks and (Virtual) Extended Formulations." arXiv preprint arXiv:2411.03006 (2024). https://arxiv.org/abs/2411.03006
As a result of this subproject, we were able to prove that lower bounds on the extension complexity can in fact easily be transferred to monotone neural networks. Moreover, we introduced a new geometric measure for polytopes, the virtual extension complexity, as a generalization of the ordinary extension complexity, and showed that this quantity is capable of lower-bounding general neural networks. This quantity is of general interest not only because it helps to understand neural networks as a modern computing technology, but also independently in combinatorial optimization, where we showed that low virtual extension complexity implies that one can efficiently optimize over the corresponding polytope.
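In rough terms (the precise definition is given in the preprint), a virtual extended formulation represents a polytope P as a formal difference of two polytopes: one looks for polytopes Q and R with P + R = Q in the sense of Minkowski sums such that both Q and R admit small extended formulations, and the virtual extension complexity of P is the smallest total size of such a pair. Choosing R to be a single point shows that this quantity is essentially never larger than the ordinary extension complexity.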
Subproject B: The related preprint can be found here: Brandenburg, Marie-Charlotte, Moritz Grillo, and Christoph Hertrich. "Decomposition Polyhedra of Piecewise Linear Functions." arXiv preprint arXiv:2410.04907 (2024). https://arxiv.org/abs/2410.04907. Accepted at the International Conference on Learning Representations (ICLR) 2025.
As a result of this subproject, we gave a polyhedral characterization of the set of possible representations of a non-convex piecewise linear function as a difference of two convex piecewise linear functions. These descriptions have an impact on both the theory of optimization and the theory of neural networks.
Both subprojects are purely focused on scientific, theoretical impact, but they target multiple communities in mathematics and computer science, ranging from machine learning via polyhedral geometry to optimization. Since both subprojects also open up several follow-up questions that are relevant in these communities, we expect our results to have a long-lasting scientific impact, not only within the individual communities but also towards the broader mission of strengthening the bridges between them.