Periodic Reporting for period 1 - NeurExCo (Synergizing Neural Network Theory and Combinatorial Optimization via Extension Complexity)
Reporting period: 2024-04-01 to 2026-03-31
In contrast, combinatorial optimization is a well-established discipline at the intersection of mathematics and computer science, dealing with classical algorithmic questions like the Shortest Path or Traveling Salesperson Problems. A powerful tool to study structural and algorithmic properties of combinatorial optimization problems is polyhedral geometry. For example, the geometric notion of extension complexity classifies how well a specific problem can be expressed and solved via an extremely successful general-purpose technique called linear programming.
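To make this notion concrete, we recall the standard definition (stated here only for orientation, not as a new contribution): the extension complexity xc(P) of a polytope P is the smallest number of facets, i.e. defining linear inequalities, of a polytope Q that can be mapped onto P by a linear projection; such a Q is called an extended formulation of P. A small extension complexity therefore means that optimizing a linear function over P can be written as a small linear program, even if P itself has very many facets.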
Recent developments show that polyhedral theory can also be a powerful tool to achieve a better mathematical understanding of neural networks. The overall goal of this project is to significantly intensify the connection between neural networks and polyhedral theory, using the concept of extension complexity. This new symbiosis has the potential to advance both the theoretical understanding of neural networks and the fundamental understanding of classical combinatorial optimization problems. On the side of neural networks, the goal is to obtain new bounds on the size and depth required to solve a given problem, serving as an explanation of why large and deep neural networks are more successful in practice. On the side of combinatorial optimization, generalized notions of extension complexity inspired by neural networks will lead to new structural and algorithmic insights into classical problems like the matching problem.
Subproject A: This subproject is closest to the original proposal and the actual core theme of the project. We investigated to what extent the extension complexity, which has previously been used to prove limits of linear programming, can also be used to prove limits of neural networks and therefore of modern AI. As this seems to be a challenging task for general neural networks, we also investigated monotone neural networks, that is, neural networks which are not allowed to subtract. For monotone neural networks, proving lower bounds via extension complexity appeared more promising. We furthermore aimed to generalize the notion of extension complexity in such a way that, on the one hand, it still has a meaningful interpretation within combinatorial optimization and, on the other hand, it is strong enough to lower-bound general (non-monotone) neural networks.
Subproject B: This second subproject is focused on a more technical and geometric question. In the course of the research related to Subproject A, it became apparent that, mathematically, subtraction makes neural networks very powerful. Without subtraction, a neural network always represents a convex piecewise linear function, while subtraction provides the power to represent non-convex functions. Consequently, in Subproject B, we investigated the relation between convex and non-convex piecewise linear functions. It has long been known that every non-convex piecewise linear function can be represented as a difference of two convex ones, but it remained an open question how much the complexity of the functions involved might increase in doing so.
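A small illustrative example (not taken from the preprint and included here only to convey the phenomenon): the "hat" function h(x) = max(0, 1 - |x|) is piecewise linear but not convex, yet it can be written as the difference h(x) = max(0, 1 + x, 2x) - max(0, 2x) of two convex piecewise linear functions. Both building blocks are convex, in line with the observation above about subtraction-free networks, and it is only the final subtraction that creates the non-convexity.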
Subproject A: The related preprint can be found here: Hertrich, Christoph, and Georg Loho. "Neural Networks and (Virtual) Extended Formulations." arXiv preprint arXiv:2411.03006 (2024). https://arxiv.org/abs/2411.03006
As a result of this subproject, we were able to prove that lower bounds on the extension complexity can in fact easily be transferred to monotone neural networks. Moreover, we introduced a new geometric measure for polytopes, the virtual extension complexity, as a generalization of the ordinary extension complexity, and showed that this quantity is capable of lower-bounding general neural networks. This quantity is of general interest not only because it helps to understand neural networks as a modern computing technology, but also independently in combinatorial optimization, where we showed that low virtual extension complexity implies that one can efficiently optimize over the corresponding polytope.
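In rough terms (the precise definition is given in the preprint), a virtual extended formulation represents a polytope P as a formal difference of two polytopes: one looks for polytopes Q and R with P + R = Q in the sense of Minkowski sums such that both Q and R admit small extended formulations, and the virtual extension complexity of P is the smallest total size of such a pair. Choosing R to be a single point shows that this quantity is essentially never larger than the ordinary extension complexity.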
Subproject B: The related preprint can be found here: Brandenburg, Marie-Charlotte, Moritz Grillo, and Christoph Hertrich. "Decomposition Polyhedra of Piecewise Linear Functions." arXiv preprint arXiv:2410.04907 (2024). https://arxiv.org/abs/2410.04907. Accepted at the International Conference on Learning Representations (ICLR) 2025.
As a result of this subproject, we gave a polyhedral characterization of the set of possible representations of a non-convex piecewise linear function as a difference of two convex piecewise linear functions. These descriptions have an impact on both the theory of optimization and the theory of neural networks.
Both subprojects are purely focused on scientific, theoretical impact, but they target multiple communities in mathematics and computer science, ranging from machine learning via polyhedral geometry to optimization. Since both subprojects also open up several follow-up questions that are relevant in these communities, we expect our results to have a long-lasting scientific impact, not only within the individual communities but also towards the broader mission of strengthening the bridges between them.