CORDIS - EU research results

Machine learning and the physics of complex and disordered systems

Periodic Reporting for period 2 - COMPLEX ML (Machine learning and the physics of complex and disordered systems)

Reporting period: 2022-09-01 to 2023-08-31

Physicists aim to derive a description of complex phenomena, often involving astronomical numbers of interacting electrons, atoms, molecules or other constituents, in terms of only a handful of relevant quantities. In doing so they, in effect, "compress" the full description of the system into a succinct theory that provides an understanding of the phenomenon and its properties and possesses predictive power. Surprisingly, a systematic path towards that goal exists, technically known as the Renormalization Group. It is, however, very difficult to carry out in practice, especially for disordered or irregular systems, which are ubiquitous in nature. Examples include quasicrystals, cell assemblies in the human brain and interactions between participants of a social network; they form part of what is collectively referred to as complex systems. Complex systems thus comprise a vast class of phenomena, from atomistic and chemical scales, through biology, all the way to social interactions and the properties of industrial energy networks. As such, their improved understanding is of fundamental importance and benefit to society.

The key objective of the interdisciplinary COMPLEX ML project is to construct new analytical and computational tools that help to develop theoretical descriptions of complex systems. To this end we take the suggestive "compression" metaphor seriously and, together with collaborators, introduce a new approach to constructing effective theories based on the theory of data compression, which originates in computer science. The computational progress is based on close connections with developments in the field of Machine Learning (ML), which over the course of the past ten years have revolutionised many areas of engineering. In the COMPLEX ML project we use such techniques to automate parts of the scientific discovery process. Conversely, we aim to improve certain aspects of ML algorithms themselves, using the established connections to complex systems physics.
Two key steps were necessary to achieve the objectives of COMPLEX ML. In the first one, with collaborators in Jerusalem, we developed new theoretical tools to analyse complex systems. We established a connection between field theory, the traditional formal language in which concepts of theoretical physics are expressed, and the language of the theory of data compression from computer science, which is part of the broader field of information theory [Phys.Rev.Lett. 126, 240601 (2021)]. We showed that what is "relevant" in data from the point of view of constructing a physical theory is exactly the same as what is "relevant" from the point of view of lossy data compression. This connection is a "Rosetta Stone", making it possible to translate physical concepts and quantities into a form more compatible with the computational methods of Machine Learning.
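As an illustration of how "relevance" can be phrased in compression terms, one standard formulation of lossy compression with a relevance variable is the information bottleneck. The notation below is an assumption introduced for illustration and is not taken from the cited paper: X denotes a set of local degrees of freedom, E their distant environment, H the compressed variable, and β a trade-off parameter.

```latex
\min_{p(h \mid x)} \; \mathcal{L} = I(X;H) - \beta\, I(H;E),
\qquad
I(A;B) = \sum_{a,b} p(a,b)\,\log\frac{p(a,b)}{p(a)\,p(b)} .
```

The first term penalises the information kept about X (compression), while the second rewards the information retained about E (relevance).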

In the second step, we used this theoretical result to build a numerical algorithm that extracts "relevant quantities" from complex-system data based on compression theory. To this end, with collaborators in Zurich, we developed the RSMI-NE code package, written in Python/TensorFlow, and made it publicly available in a GitHub repository [https://github.com/RSMI-NE/RSMI-NE] and as a Python package. We applied this algorithm, based on our theory and state-of-the-art results in Machine Learning, to an important statistical mechanical model [Phys.Rev.Lett. 127, 240603 (2021)], demonstrating that a full theoretical understanding of a system can be obtained in a semi-automatic fashion using RSMI-NE. We showed that our methods also provide insights into the symmetry properties of the system [Phys.Rev. E 104, 064106 (2021)].
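The idea of ranking candidate "relevant quantities" by the information they retain can be illustrated on a toy model. The following is a minimal sketch and does not use the RSMI-NE package or its API: it enumerates a six-spin open Ising chain exactly (feasible only for tiny systems) and compares how much mutual information several hand-picked compressions of a two-spin block keep about two distant spins. The model, the coupling value, the choice of block and environment, and all function names are assumptions made for illustration.

```python
import itertools
import numpy as np

def ising_chain_distribution(n_spins, coupling):
    """Exact Boltzmann distribution of an open 1D Ising chain.

    `coupling` is the dimensionless product of inverse temperature
    and bond strength (an illustrative parameter).
    """
    configs = np.array(list(itertools.product([-1, 1], repeat=n_spins)))
    energy = -coupling * np.sum(configs[:, :-1] * configs[:, 1:], axis=1)
    weights = np.exp(-energy)
    return configs, weights / weights.sum()

def mutual_information(labels_a, labels_b, probs):
    """Mutual information I(A;B) in nats from per-configuration labels."""
    joint, marg_a, marg_b = {}, {}, {}
    for a, b, p in zip(labels_a, labels_b, probs):
        joint[(a, b)] = joint.get((a, b), 0.0) + p
        marg_a[a] = marg_a.get(a, 0.0) + p
        marg_b[b] = marg_b.get(b, 0.0) + p
    return float(sum(p * np.log(p / (marg_a[a] * marg_b[b]))
                     for (a, b), p in joint.items()))

def as_labels(array):
    """Turn each row into a hashable tuple of Python ints."""
    return [tuple(int(s) for s in row) for row in array]

configs, probs = ising_chain_distribution(n_spins=6, coupling=0.8)
block = configs[:, 2:4]                        # spins 2 and 3: the block to compress
environment = as_labels(configs[:, [0, 5]])    # spins 0 and 5: the distant environment

# Hand-picked candidate compressions H = f(block); all are illustrative.
candidate_rules = {
    "full block (no compression)": as_labels(block),
    "decimation (keep first spin)": as_labels(block[:, :1]),
    "block sum": as_labels(block.sum(axis=1, keepdims=True)),
    "bond product": as_labels(block.prod(axis=1, keepdims=True)),
}

retained_information = {
    name: mutual_information(labels, environment, probs)
    for name, labels in candidate_rules.items()
}

for name, value in retained_information.items():
    print(f"{name:32s} I(H;E) = {value:.4f} nats")
```

By the data-processing inequality no compression of the block can retain more information about the environment than the uncompressed block does, so the first entry is an upper bound for the others.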

Having achieved the first part of our objective, implementing novel compression-theory-based methods for regular and ordered systems, in the second phase the effort has been to extend these tools to the challenging disordered and inhomogeneous cases. To this end a new version of the RSMI-NE code has been implemented. Together with collaborators in Cardiff, Oxford, Jerusalem and Zurich we used it to study a strongly correlated system on a quasicrystal, discovering novel emergent phenomena [arXiv:2301.11934]. This achieved a major goal for COMPLEX ML, providing information about physical systems beyond what is already known and accessible with other methods. We further demonstrated how information-theoretic approaches can be used to construct simplified models of dynamics, and in precision numerics in 3D lattice gauge theories, which are of theoretical importance. Publications summarising these results are in preparation.

A complementary component of the COMPLEX ML project was the derivation of new insights to improve ML algorithms. We focused on the question of gradient-free training of Binarized Neural Networks (BNNs), which are of great practical interest because they could cut the cost (and energy use) of training by orders of magnitude compared to their full-precision counterparts. Together with collaborators in Ukraine working in industry, we investigated the idea of applying physical Monte Carlo methods to this problem and presented proof-of-concept results in the "Binary Networks for Computer Vision" workshop at the CVPR 2021 conference.
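To make the idea of gradient-free Monte Carlo training concrete, the following is a minimal sketch under stated assumptions, not the method presented at the workshop: a single binarized neuron with weights in {-1, +1} is trained by Metropolis sampling, where the "energy" is the number of misclassified examples and each proposal flips one weight. The teacher-student toy data, the temperature, the step count and all function names are hypothetical.

```python
import numpy as np

def binarized_forward(weights, inputs):
    """Single binarized neuron: sign of the integer pre-activation."""
    return np.where(inputs @ weights >= 0, 1, -1)

def error_count(weights, inputs, targets):
    """Energy function: number of misclassified training examples."""
    return int(np.sum(binarized_forward(weights, inputs) != targets))

def metropolis_train(inputs, targets, n_steps=2000, temperature=0.5, seed=0):
    """Gradient-free training by single-weight-flip Metropolis updates."""
    rng = np.random.default_rng(seed)
    weights = rng.choice([-1, 1], size=inputs.shape[1])
    energy = error_count(weights, inputs, targets)
    initial_energy = energy
    best_weights, best_energy = weights.copy(), energy
    for _ in range(n_steps):
        i = rng.integers(inputs.shape[1])
        weights[i] *= -1                       # propose flipping one weight
        new_energy = error_count(weights, inputs, targets)
        accept = (new_energy <= energy or
                  rng.random() < np.exp((energy - new_energy) / temperature))
        if accept:
            energy = new_energy
            if energy < best_energy:
                best_weights, best_energy = weights.copy(), energy
        else:
            weights[i] *= -1                   # reject: undo the flip
    return best_weights, best_energy, initial_energy

# Toy teacher-student data; an odd feature count avoids zero pre-activations.
data_rng = np.random.default_rng(1)
n_features = 15
teacher = data_rng.choice([-1, 1], size=n_features)
inputs = data_rng.choice([-1, 1], size=(200, n_features))
targets = binarized_forward(teacher, inputs)

best_weights, best_errors, initial_errors = metropolis_train(inputs, targets)
print(f"errors: initial {initial_errors}, best found {best_errors}")
```

Only integer additions and comparisons are needed to evaluate a proposal, which is the kind of saving over full-precision gradient computation that motivates this line of work.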

Our results were published in peer-reviewed journals and presented at conferences, workshops and invited talks in Europe and the USA. Outreach activities involving primary and high-school students were carried out within the "Science is wonderful" programme.
New theoretical tools have been developed for the analysis of complex statistical systems, based on the formalism of compression theory [Phys.Rev.Lett. 126, 240601 (2021)]. This is of stand-alone theoretical importance, but we also used it to introduce a novel algorithmic approach to constructing theoretical descriptions of complex systems. To this end we combined our theoretical insights with state-of-the-art results in Machine Learning to develop the RSMI-NE algorithm [Phys.Rev. E 104, 064106 (2021), https://github.com/RSMI-NE/RSMI-NE], which is in many respects beyond the state of the art. The scope of information it provides about a physical system, untainted by prior knowledge, is comparable only to that of diagonalization methods, which can be applied only to tiny simulated systems, whereas RSMI-NE can be applied to high-dimensional systems and even to experimental data.

In the second phase of the COMPLEX ML project these results have been extended to inhomogeneous and dynamical systems. In particular, the RSMI-NE package has been extended to systems on arbitrary static graphs, fulfilling the main goal of the proposal, i.e. the development of novel techniques applicable to systems with correlated disorder. We applied these tools to strongly correlated systems on quasicrystals, discovering new emergent phenomena [arXiv:2301.11934]. We further demonstrated how information-theoretic approaches can be used to construct simplified models of dynamics, and to advance precision numerics in three-dimensional lattice gauge theories, which are of theoretical importance. The publications summarising the last two results are currently being prepared.
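On an irregular graph there are no lattice coordinates, so spatial regions have to be defined from the connectivity alone. As a hedged illustration of one way this could be done, and not a description of how the RSMI-NE package does it, the sketch below uses breadth-first-search distances to split the nodes around a chosen centre into a block, a separating buffer and an environment shell. The three-region scheme, the parameter names and the example graph are all assumptions.

```python
from collections import deque

def graph_distances(adjacency, source):
    """Breadth-first-search hop distances from `source` to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def partition_regions(adjacency, centre, block_radius, buffer_width, env_width):
    """Split nodes into block, buffer and environment by graph distance."""
    dist = graph_distances(adjacency, centre)
    inner = block_radius + buffer_width        # last distance inside the buffer
    outer = inner + env_width                  # last distance inside the environment
    block = {n for n, d in dist.items() if d <= block_radius}
    buffer = {n for n, d in dist.items() if block_radius < d <= inner}
    environment = {n for n, d in dist.items() if inner < d <= outer}
    return block, buffer, environment

# A small irregular graph given as an adjacency list (hypothetical example).
adjacency = {
    0: [1, 2, 3],
    1: [0, 4],
    2: [0, 4, 5],
    3: [0],
    4: [1, 2, 6],
    5: [2, 6],
    6: [4, 5, 7],
    7: [6],
}

block, buffer, environment = partition_regions(
    adjacency, centre=0, block_radius=1, buffer_width=1, env_width=2
)
print("block:", sorted(block))
print("buffer:", sorted(buffer))
print("environment:", sorted(environment))
```

Because the regions depend only on hop distance, the same function applies unchanged to regular lattices, quasicrystal tilings or any other static graph supplied as an adjacency list.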

The final result of the COMPLEX ML project has thus been the development, both theoretical and at the code level, of an entirely novel class of numerical approaches to constructing effective descriptions of complex systems, and their validation in systems of current research interest.
The RSMI-NE algorithm outline.