## Final Report Summary - HYPERBOLIC GRAPHS (Hyperbolic random graphs)

The aim of this project is the development of the theory of random geometric graphs on non-Euclidean spaces.

In particular, our focus is the hyperbolic plane, which is a space of constant negative curvature. This is a natural next step for the theory of random geometric graphs, whose development started in the late 1960s, has accelerated sharply over the last 15 years, and has mainly focused on geometric graphs on Euclidean spaces. The second and perhaps most timely aim is to explore the potential of random graphs on the hyperbolic plane as a model for networks that arise in applications, such as the World Wide Web, the Internet, various collaboration networks and social networks. The rapid development of the Internet over the last 15 years and its vast impact on modern western societies have brought the search for a model of real-world networks to the forefront of science.

Recent work has shown that hyperbolic random graphs exhibit properties that have been observed in a number of real-world networks. This work has been based mainly on heuristics and computer-aided simulations. The main aim of this project is the rigorous verification of these observations and the exploration of further properties of this class of random graphs that will lead to the development of their mathematical theory.

The random graph model is as follows. Assume that one wants to create a network with N nodes. A reference point is specified on the hyperbolic plane and a disc of radius R is considered around the reference point. Here, R is a certain function of N that grows logarithmically with N. Thereafter, N random points are chosen within this disc according to a certain distribution which is determined by a specific parameter denoted by a. These points are the nodes of the random graph. Subsequently, any two of them are joined with a probability that depends on their hyperbolic distance, independently of any other pair. This probability is chosen so that the closer two points are, the more likely they are to be joined.
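The construction above can be sketched in a few lines of Python. This is a minimal illustration under assumptions that the description leaves open: the radius is taken as R = 2 log N + c, the radial density is proportional to sinh(a r), the angles are uniform, and two points are joined exactly when their hyperbolic distance is at most R (a connection probability that is 1 below the threshold and 0 above it). The function names and the constant c are hypothetical.

```python
import math
import random


def hyperbolic_distance(p, q):
    """Distance between two points given in polar coordinates (r, theta)."""
    r1, t1 = p
    r2, t2 = q
    # Angle between the two points, folded into [0, pi].
    dtheta = math.pi - abs(math.pi - abs(t1 - t2))
    x = (math.cosh(r1) * math.cosh(r2)
         - math.sinh(r1) * math.sinh(r2) * math.cos(dtheta))
    # Guard against rounding slightly below 1 before applying acosh.
    return math.acosh(max(1.0, x))


def sample_hyperbolic_graph(n, a=0.75, c=0.0, seed=None):
    """Sample n points in a hyperbolic disc and join close pairs.

    Assumptions (not fixed by the text): R = 2*log(n) + c, radial density
    proportional to sinh(a*r) on [0, R], and a hard distance threshold R.
    """
    rng = random.Random(seed)
    R = 2.0 * math.log(n) + c
    points = []
    for _ in range(n):
        u = rng.random()
        # Inverse-CDF sampling with F(r) = (cosh(a*r) - 1) / (cosh(a*R) - 1).
        r = math.acosh(1.0 + u * (math.cosh(a * R) - 1.0)) / a
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((r, theta))
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if hyperbolic_distance(points[i], points[j]) <= R:
                edges.add((i, j))
    return points, edges, R
```

The quadratic loop over pairs is only meant for small N; it keeps the sketch close to the verbal description rather than aiming at efficiency.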

The degree of a node is the number of other nodes it is attached to. We have determined the distribution of the degrees of the nodes in this class of random graphs, that is, we have determined the fraction of nodes that have a certain degree. It turns out that the distribution is closely determined by the parameter a, which "tunes" the way the nodes are distributed on the hyperbolic plane. We determine a range for the parameter a in which the degree distribution follows a so-called power law. This means that the fraction of nodes that have degree equal to k scales like a negative power of k. In fact, this power can be any real number that is less than -2. Such degree distributions have been observed in several networks that arise in applications, such as citation networks, collaboration networks, power-grid networks and biological networks. We also determine the degree distribution for the remaining values of the parameter a and find that the structure of the graph becomes significantly different, in that the random graph becomes dense.
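In symbols, the power-law statement can be written as follows. The notation is introduced here only for illustration: p_k is the fraction of nodes of degree k, c is a positive constant, and γ is the (positive) exponent, so that the power of k is -γ.

```latex
\[
  p_k \;=\; \frac{\#\{\text{nodes of degree } k\}}{N} \;\approx\; c\,k^{-\gamma},
  \qquad \gamma > 2 .
\]
```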

Another aspect of our research has to do with the component structure of a hyperbolic random graph. We identify critical values for the defining parameters of the model at which sudden changes occur in the structure of the random graph. These changes concern the connectivity of the random graph as well as the size of its largest component. More specifically, we show that when the parameter a is less than 1/2, the random graph is connected with high probability, that is, every two nodes are connected by a path. However, when a is bigger than 1/2, then with high probability the random graph is not connected. In particular, it can be shown that it is very likely that there are nodes that are not connected to any other node. Furthermore, we identify another critical point, which has to do with the size of the largest connected component. Namely, we show that when a is smaller than 1, then with high probability the largest connected component of the random graph contains a certain fraction of the nodes. However, when a is larger than 1, then with high probability all components are small, that is, the random graph breaks into small connected pieces.

Also, we have shown that the fraction of vertices that are contained in the largest connected component is sharply concentrated around a certain constant, which we have been able to determine explicitly. We further showed that the second largest connected component is in fact much smaller: as a fraction of the total number of nodes, its size is vanishing.

Next, we focus on the phenomenon of clustering in a hyperbolic random graph. Clustering is a ubiquitous property of networks that naturally emerge in applications. For example, in the context of social networks clustering amounts to the following: two individuals that have a common friend are somewhat more likely to be friends of each other. This reflects a tendency of individuals to form groups or "clusters" according to their socio-economic background, their interests as well as their individual circumstances. These clusters are identified with small sets of nodes on which the density is significantly higher than that of the network itself. The existence of clustering can be measured in several ways. One of these is the so-called clustering coefficient. This is (essentially) the ratio of the number of triangles of the network to the number of triples of vertices that form a path of length 2. The latter represents the number of instances where two individuals in a social network have a common acquaintance. If this ratio remains bounded away from 0 as the number of nodes becomes large, then we say that clustering is present. In the context of hyperbolic random graphs, we show that typically this is the case. Moreover, we have shown that the clustering coefficient is concentrated around a single value, which we have been able to calculate explicitly.
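The ratio described above can be computed directly from an edge list. The following is a minimal sketch with a hypothetical function name; it assumes an undirected simple graph given as distinct pairs (u, v) with no self-loops. It returns the plain ratio of triangles to paths of length 2; the commonly used normalisation multiplies this ratio by 3, since each triangle contains three such paths.

```python
def clustering_ratio(n, edges):
    """Return (triangles, paths of length 2, triangles / paths)."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Common neighbours of the endpoints of an edge close a triangle;
    # summing over edges counts every triangle three times.
    triangles = sum(len(adj[u] & adj[v]) for u, v in edges) // 3
    # A path of length 2 is a middle node together with two of its neighbours.
    paths = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj)
    ratio = triangles / paths if paths else 0.0
    return triangles, paths, ratio
```

For example, a single triangle gives one triangle and three paths of length 2, so the plain ratio is 1/3 and the normalised coefficient is 1.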

Our next topic of study is the class of random graphs on the hyperbolic plane as a small world. In particular, we have shown that most pairs of vertices that belong to the same connected component are very close to each other, namely within a number of hops that scales like the double logarithm of the total number of nodes of the network. So in fact we proved what is known as the ultra-small world effect.

Furthermore, we considered dissemination processes on these random graphs. In particular, we considered the class of the so-called bootstrap percolation processes. This is a class of processes that have their origins in statistical physics. The nodes have two states: infected or uninfected. A bootstrap percolation process evolves in rounds. At every round a node that is uninfected becomes infected if it has at least a certain number of infected neighbours. In its simplest version, it is assumed that this threshold is the same for all nodes and, moreover, a node that becomes infected remains so forever. We analysed this process on random graphs on the hyperbolic plane and showed that this class of random graphs has the ability to spread a small initial infection to a large part of the network. We determined precisely the amount of initial infection that is needed for this phenomenon to occur.
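The simplest version of the process, with a common threshold and permanent infection, can be simulated as follows. This is a minimal sketch with hypothetical names; it assumes an undirected graph given as an edge list and applies the rule synchronously, once per round, to all uninfected nodes.

```python
def bootstrap_percolation(n, edges, initially_infected, threshold):
    """Run bootstrap percolation until no further node becomes infected.

    Returns the final set of infected nodes and the number of rounds in
    which at least one new node was infected.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    infected = set(initially_infected)
    rounds = 0
    while True:
        # An uninfected node is infected once it has at least
        # `threshold` infected neighbours; infection is never reversed.
        newly = {v for v in range(n)
                 if v not in infected
                 and len(adj[v] & infected) >= threshold}
        if not newly:
            break
        infected |= newly
        rounds += 1
    return infected, rounds
```

Because the infected set only grows, the loop stops after at most n rounds.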

We also considered the above class of processes in a completely different setting of random graphs, namely inhomogeneous random graphs. These are random graphs where the vertices are equipped with weights and every edge appears independently with a probability that is proportional to the product of the weights of its endpoints. Effectively, this model of random graphs realises the inhomogeneity that is ubiquitous in networks that emerge in applications. The hyperbolic model, which constitutes the thematic core of this project, is a special case of the inhomogeneous model, except for the presence of dependencies between the edges. In this generalised setting, we determined the conditions on the sequence of weights which ensure that the following phenomenon occurs: a small initially infected set spreads to a large part of the network.
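A graph of this kind can be sampled as sketched below. The normalisation is an assumption, since the text only says that the probability is proportional to the product of the weights: here nodes i and j are joined with probability min(1, w_i w_j / W), where W is the sum of all weights. The function name is hypothetical.

```python
import random


def sample_inhomogeneous_graph(weights, seed=None):
    """Sample a graph whose edges appear independently of each other.

    Assumption: the edge {i, j} appears with probability
    min(1, weights[i] * weights[j] / sum(weights)).
    """
    rng = random.Random(seed)
    n = len(weights)
    total = sum(weights)
    edges = set()
    if total <= 0:
        return edges  # no positive weight, hence no edges
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, weights[i] * weights[j] / total)
            if rng.random() < p:
                edges.add((i, j))
    return edges
```

Nodes with larger weights receive more edges on average, which is how the weight sequence encodes the inhomogeneity of the network.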

Finally, we considered generalisations of the bootstrap process, where a node samples a subset of its neighbours and sets its state according to the majority within the sample. The main difference from the classical bootstrap process is that a node can switch between infected and uninfected, whereas in a bootstrap process once a node becomes infected it stays so forever. Also, these processes are more realistic, as they do not assume knowledge of the state of all the neighbours of any given node but only of a small sample of them. We considered this process in the context of random networks that are created through the preferential attachment model. We showed that the process does not fluctuate forever but in fact converges to unanimity, that is, at the end all nodes have the state of the initial majority.
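One round of such a sampled-majority process can be sketched as follows. Several details are assumptions made for illustration, as the text does not fix them: all nodes update synchronously, each node draws k neighbours uniformly at random with replacement, ties leave the state unchanged, and nodes without neighbours never change. States are encoded as 0 and 1, and the function names are hypothetical.

```python
import random


def sampled_majority_round(adj, state, k, rng):
    """Apply one synchronous round; adj[v] is the list of neighbours of v."""
    new_state = list(state)
    for v, nbrs in enumerate(adj):
        if not nbrs:
            continue  # isolated node: nothing to sample
        sample = [rng.choice(nbrs) for _ in range(k)]
        ones = sum(state[u] for u in sample)
        if 2 * ones > k:
            new_state[v] = 1
        elif 2 * ones < k:
            new_state[v] = 0
        # On a tie the node keeps its current state.
    return new_state


def run_until_unanimous(adj, state, k, max_rounds=1000, seed=None):
    """Iterate rounds until all states agree or max_rounds is reached."""
    rng = random.Random(seed)
    state = list(state)
    for t in range(max_rounds):
        if len(set(state)) <= 1:
            return state, t
        state = sampled_majority_round(adj, state, k, rng)
    return state, max_rounds
```

Unlike bootstrap percolation, a node may switch in either direction here, so termination is not automatic; the cap on the number of rounds guards against that in the simulation.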
