## Periodic Reporting for period 1 - SpTheoryGraphLim (Spectral Theory of Graph Limits)

**Reporting period:** 2015-05-01 to 2017-04-30

## Summary of the context and overall objectives of the project

Networks with a large number of interacting nodes come up in various branches of science. There is a rapidly growing need to understand their behavior and properties, and to work out algorithms for them. The problem, however, is that real-life networks tend to be too large for standard graph-theoretic tools and methods. New areas of mathematics (such as graph convergence and parallel algorithms) have recently been developed to address this problem. This project aimed to study these areas, with particular emphasis on their spectral aspects.

In parallel algorithms the idea is to distribute the computation among the nodes of the network and thereby achieve a constant running time. The project focused on certain randomized local algorithms (called factor of IID processes in probability theory).

One of the key techniques that the project used and extended is entropy inequalities. They constrain what can be achieved by randomized local algorithms. The Shannon entropy measures the "uncertainty" of a random state. One can consider the entropy of the random output of a local algorithm on different sets of nodes. It turns out that these entropies satisfy certain inequalities. Such inequalities have played a central role in several remarkable recent results, e.g. the Backhausz-Szegedy result on the eigenvectors of random regular graphs.

The main results of this project include an analysis of how independent different parts of the output of a randomized local algorithm are. Two aspects of independence were considered: correlation and mutual information. The project also made progress in developing new entropy inequalities: an approach was found that provides a recipe for finding and proving them.
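To make the notion of Shannon entropy used above concrete, here is a minimal Python sketch that computes it for a discrete distribution and for a list of observed states. The function names `shannon_entropy` and `empirical_entropy` are illustrative choices and are not taken from the project.

```python
import math
from collections import Counter


def shannon_entropy(probabilities):
    """Shannon entropy (in bits) of a discrete distribution.

    Zero-probability outcomes contribute nothing, by the usual convention.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def empirical_entropy(samples):
    """Entropy (in bits) of the empirical distribution of observed states."""
    counts = Counter(samples)
    total = len(samples)
    return shannon_entropy(c / total for c in counts.values())


if __name__ == "__main__":
    # A fair coin carries one bit of uncertainty; a constant state carries none.
    print(shannon_entropy([0.5, 0.5]))
    print(shannon_entropy([1.0]))
    print(empirical_entropy("aabb"))
```

Applied to the joint output of a local algorithm on a set of nodes, the same quantity gives the entropies that the inequalities relate to one another.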

## Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far

The main results of the project are concerned with factor of IID processes. Such a process essentially describes the output of a randomized local algorithm on a regular tree (i.e. an infinite network with the property that it contains no cycles and each node has the same number of connections/neighbors).
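As an illustration of what such a process looks like, the following Python sketch runs one very simple randomized local rule on a finite ball of the d-regular tree: every node draws an independent uniform label and outputs 1 exactly when its label exceeds the labels of all its neighbors. This particular rule, the finite truncation of the tree, and the names `build_tree_ball` and `local_maxima_factor` are assumptions made for the example; they are not the processes studied in the project.

```python
import random


def build_tree_ball(d, depth):
    """Adjacency lists of the ball of radius `depth` in the d-regular tree."""
    adj = {0: []}
    frontier = [0]
    for _ in range(depth):
        next_frontier = []
        for v in frontier:
            # The root gets d children; every other node already has a parent,
            # so it gets d - 1 children.
            for _ in range(d - len(adj[v])):
                w = len(adj)
                adj[w] = [v]
                adj[v].append(w)
                next_frontier.append(w)
        frontier = next_frontier
    return adj


def local_maxima_factor(adj, rng):
    """One round of a local rule: a node outputs 1 iff its IID label is a
    strict local maximum among its neighbors."""
    labels = {v: rng.random() for v in adj}
    return {v: int(all(labels[v] > labels[u] for u in adj[v])) for v in adj}


if __name__ == "__main__":
    tree = build_tree_ball(d=3, depth=4)
    output = local_maxima_factor(tree, random.Random(0))
    print(sum(output.values()), "of", len(tree), "nodes selected")
```

The output of each node depends only on the labels within distance 1, and the selected nodes always form an independent set. Nodes on the boundary of the finite ball have fewer neighbors than in the infinite tree, so the sketch only approximates the infinite process near the root.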

It is an interesting question how independent the states (i.e. the random outputs on different nodes) of these algorithms are. A sharp upper bound was already known for the correlation of the states of two nodes at a given distance. In this project this result was extended: if one restricts such a process to two distant connected subgraphs of the tree, then the two parts are nearly uncorrelated. The order of this correlation was determined in terms of the distance between the subgraphs. This is a quantitative version of the fact that factor of IID processes have trivial 1-ended tails. The proof exploits the spectral properties of the graph limit object, which is the regular tree in our case, in the form of the spectrum of the so-called non-backtracking operator.

Another notion that measures independence is mutual information, known from information theory. Its advantage over correlation is that correlation only detects linear dependence. It is therefore natural to study the mutual information of the states of two given nodes. If the nodes are connected (i.e. their distance is 1), then a known inequality provides an upper bound for the (normalized) mutual information. In this project upper bounds were obtained for nodes at an arbitrary distance. Although these bounds are sharp, an interesting phenomenon occurs here: for any fixed process the mutual information decays much faster. In other words, the order of decay for a single fixed process differs from the order of the best bound that holds for all processes simultaneously.
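The difference between correlation and mutual information can be seen on a standard toy example, sketched below in Python: X is uniform on {-1, 0, 1} and Y = X². The two variables are uncorrelated, yet Y is completely determined by X, so their mutual information is positive. The example and the helper names are illustrative and are not taken from the project.

```python
import math


def mutual_information(joint):
    """Mutual information (in bits) of a joint distribution {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


def covariance(joint):
    """Covariance of X and Y under a joint distribution {(x, y): p}."""
    ex = sum(p * x for (x, _), p in joint.items())
    ey = sum(p * y for (_, y), p in joint.items())
    exy = sum(p * x * y for (x, y), p in joint.items())
    return exy - ex * ey


# X uniform on {-1, 0, 1}, Y = X**2: dependent but uncorrelated.
JOINT = {(-1, 1): 1 / 3, (0, 0): 1 / 3, (1, 1): 1 / 3}

if __name__ == "__main__":
    print("covariance:", covariance(JOINT))
    print("mutual information (bits):", mutual_information(JOINT))
```

Here the covariance vanishes while the mutual information equals the entropy of Y, namely log2(3) - 2/3 bits.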

The project also made progress on entropy inequalities. Given a process, one can assign entropies to different finite subgraphs of a regular tree. There are linear inequalities between these entropies that hold for any factor of IID process. A new approach for finding and proving such inequalities was obtained in this project. The key tool in the proof is a generalization of the edge-vertex inequality to a broader class of factor processes with fewer symmetries.
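For orientation, the edge-vertex inequality is usually stated in the literature in the following form, for a factor of IID process with finitely many states on the d-regular tree. The notation is an assumption of this sketch and does not come from the report: X_v denotes the state of a node v, (u, v) is an edge, H is Shannon entropy and I is mutual information.

```latex
% Edge-vertex entropy inequality on the d-regular tree (u and v adjacent):
\[
  \frac{d}{2}\, H(X_u, X_v) \;\ge\; (d-1)\, H(X_v).
\]
% Since H(X_u) = H(X_v) by symmetry, this bounds the normalized
% mutual information of two adjacent nodes:
\[
  I(X_u; X_v) \;=\; 2\,H(X_v) - H(X_u, X_v) \;\le\; \frac{2}{d}\, H(X_v),
  \qquad\text{i.e.}\qquad
  \frac{I(X_u; X_v)}{H(X_v)} \;\le\; \frac{2}{d}.
\]
```

The second display is the known distance-1 bound on normalized mutual information; the generalizations obtained in the project are not reproduced here.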

This edge-vertex inequality was further generalized in the final months of the project: a hypergraph version was proved. A hypergraph is a network in which an edge/connection may involve more than two vertices/nodes. The paper containing these results is in preparation.

Dissemination:

Harangi has finished three research papers in this project. Two of them have already been accepted for publication. All three papers have been accepted by or submitted to Q1 (first quartile) journals. One further paper is in preparation.

Harangi also presented the results of this project in several research talks at various universities.

Harangi was invited to give a talk at a public event organized and hosted by the Hungarian Academy of Sciences where he spoke about the research carried out during the period of this project and about his experience with applying for and participating in a Marie Sklodowska-Curie grant.

Harangi was interviewed by an online mathematics portal (ematlap.hu). This was a perfect opportunity for him to popularize his research to a wider audience (not exclusively consisting of researchers but also high school and university students/teachers, and other mathematics enthusiasts).

For details see the project website www.renyi.hu/~harangi/msc/ listing research papers, talks (with slides and even video links when available), public engagement activities, etc.

## Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)

The progress made in this project helps us understand what can be achieved by randomized local algorithms on large networks. The main results provide constraints for factor of IID processes. The rate of correlation decay between two nodes was already known for these processes. In this project this result was extended to the correlation between distant connected subgraphs. Another aspect of independence, mutual information, was also considered, and the rate of decay was determined in this case as well. New entropy inequalities were also found, providing further constraints for these local algorithms. Entropy inequalities had already been a useful tool. The new inequalities obtained in this project will hopefully have further consequences, making entropy inequalities a truly powerful tool for answering questions about such algorithms.

This research topic is at the meeting point of many different mathematical disciplines (such as graph theory, probability theory, group theory, dynamical systems). This interdisciplinary nature of the project makes it likely that the results will find their applications in various areas. The tools developed in this project are expected to have many applications in mathematics.

As for applications outside mathematics, the questions studied in this project are not directly motivated by real-life applications. The project proposed basic research aiming to develop an abstract mathematical theory. That said, the original motivation behind this area is the real-life networks that turn up in various scientific fields. A deep understanding of this topic can therefore help us better understand such social, biological, or computer networks. Given the close connection of the topic and its notions to large networks and parallel algorithms, one expects that developing this theory will eventually lead to real-life applications as well.
