Periodic Reporting for period 1 - TIMING (Temporal Innovative Model for Imitative Network Generation: a framework to analyse temporal networks and generate surrogates)
Reporting period: 2023-10-01 to 2025-01-31
Networks are a standard representation tool in the physics of complex systems, able to describe systems composed of multiple interacting agents. Formed by a set of discrete nodes and the connections between them, networks schematise the interactions among elements, providing a representative picture of the system architecture. Network science has revolutionised data analysis and modelling by introducing a new way to describe relationships between constituent elements in many disciplines, from physics to sociology, biology, and economics.
In many cases, agents’ interactions evolve in time, with links appearing and disappearing, and their description requires temporal networks. This framework is fundamental in many settings, such as neuronal functions and ecosystems, but is particularly useful for describing social contexts, where connections among people change spontaneously over time, in both physical and remote interactions, with non-trivial temporal correlations and structures.
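As an illustration of what a temporal network looks like as data, the sketch below stores it as a list of timestamped contacts and groups them into per-time-step snapshots. The toy contacts, the node labels and the function name `snapshots` are hypothetical and are not taken from the TIMING code.

```python
from collections import defaultdict

# Hypothetical toy data: each contact is (time step, node i, node j).
contacts = [
    (0, "a", "b"),
    (0, "b", "c"),
    (1, "a", "c"),
    (3, "b", "c"),
]

def snapshots(contacts):
    """Group contacts by time step into sets of undirected edges."""
    by_time = defaultdict(set)
    for t, i, j in contacts:
        # frozenset makes the edge undirected: (i, j) equals (j, i).
        by_time[t].add(frozenset((i, j)))
    return dict(by_time)

snaps = snapshots(contacts)
for t in sorted(snaps):
    print(t, sorted(tuple(sorted(edge)) for edge in snaps[t]))
```

Time steps without any contact (here, step 2) simply do not appear among the snapshots, which reflects links appearing and disappearing over time.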
Both static and temporal networks provide discrete topologies on which dynamical processes can be simulated, including, for instance, spreading phenomena (of diseases, opinions or information), transportation models, communication, synchronisation, and consensus formation. Evidence in the literature suggests that the properties of these collective behaviours strongly depend on the structure of the underlying network and its temporal evolution. Hence, an accurate description of these processes requires temporal networks that reflect real-world time-dependent patterns. Unfortunately, information about real temporal sequences of interactions is usually incomplete because suitable datasets are difficult to collect, and typically only very small real temporal networks are obtained.
In this context, synthetic networks that mimic the observed complex patterns of real structures can serve as surrogate substrates on which to simulate processes. Such surrogates can be generated with a temporal extent different from that of the input data and can therefore be used for data augmentation, addressing the problem of datasets with limited duration.
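To make the idea of simulating a process on a temporal network concrete, here is a minimal sketch of a susceptible-infected (SI) spreading process run over timestamped contacts. It is a generic textbook-style model written for illustration, not the simulation code used in TIMING; the function name `si_spread`, the parameter `beta` (per-contact transmission probability) and the toy contacts are assumptions.

```python
import random
from collections import defaultdict

def si_spread(contacts, seed, beta, rng):
    """Return {node: infection time} for an SI process on timestamped contacts.

    Updates are synchronous: a node infected at step t can only
    transmit from step t + 1 onwards.
    """
    by_time = defaultdict(list)
    for t, i, j in contacts:
        by_time[t].append((i, j))
    times = sorted(by_time)
    # The seed is taken as infected from the first recorded time step.
    infection_time = {seed: times[0] if times else 0}
    for t in times:
        newly = set()
        for i, j in by_time[t]:
            # Contacts are undirected: try transmission both ways.
            for src, dst in ((i, j), (j, i)):
                if src in infection_time and dst not in infection_time:
                    if rng.random() < beta:
                        newly.add(dst)
        for node in newly:
            infection_time[node] = t
    return infection_time

# Hypothetical toy contacts: (time step, node i, node j).
contacts = [(0, "a", "b"), (0, "b", "c"), (1, "a", "c"), (3, "b", "c")]
result = si_spread(contacts, seed="a", beta=1.0, rng=random.Random(0))
print(result)
```

The order of contacts matters here: with `beta=1.0`, node "c" is reached at step 1 through "a", not at step 0 through "b", because "b" only becomes infectious after step 0. This is the kind of dependence on temporal structure that the paragraph above refers to.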
The main goal of TIMING has been to develop an algorithm to generate such synthetic networks.
Unlike existing models for temporal network generation, the framework developed by TIMING (i) considers time and topology simultaneously, (ii) uses a mesoscale approach for both dimensions (taking into account both the microscopic characteristics of each node and the macroscopic network features), and (iii) combines theoretical models with an emulative algorithm to reproduce the complexity of specific empirical spatio-temporal patterns.
As a consequence, the resulting surrogate networks display a complex interplay of structural and temporal properties similar to that observed in the real networks used as input. Moreover, we have shown that simulations of a variety of dynamical processes on the surrogate networks yield outcomes similar to those obtained on the empirical data. We have tested three dynamical processes, describing respectively epidemic spread, opinion formation, and the emergence of norms in a population. The fact that simulations of such different processes on the surrogate networks produce a phenomenology similar to that on the original data highlights the versatility of our method.
The attached figure schematically displays the idea of the algorithm and some results that compare the obtained networks with the original data, highlighting their similarity.
An important use of surrogate networks is to provide substrates with realistic properties on which dynamical processes can be studied over a sufficiently wide range of time scales, even when sufficiently long datasets are not available. An important example is the simulation of realistic spreading processes, which often have longer timescales than most available datasets. It is then typically necessary to use multiple repetitions of the same temporal network, which affects the variability of interactions and hence the realism of the observed behaviours. The methodology that we have developed makes it possible to circumvent this difficulty by generating surrogate data of the needed length, without repeating exactly the same patterns.
A further development of the present algorithm, not yet published, uses an existing dataset to produce realistic surrogate data with larger population sizes. This yields the important benefit of being able to simulate dynamical processes on large networks without needing to actually collect the corresponding data (hence, for instance, without privacy concerns). Thus, besides augmenting the original data, this methodology also makes it possible to protect privacy in the case of social data. It is therefore possible to generate synthetic data that are arbitrarily large and free from privacy issues.
Moreover, an additional advantage lies in the versatility of the method: it is applicable to any temporal network given as input, without limits of size or context, and can be extended to directed, weighted, and labelled temporal networks.
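For reference, the repetition workaround mentioned above can be sketched as follows. This is the naive baseline that surrogate generation is meant to replace, not the TIMING algorithm; the function name `repeat_temporal_network` and the toy contacts are hypothetical.

```python
def repeat_temporal_network(contacts, duration, repetitions):
    """Extend a temporal network by replaying it back to back.

    contacts:    list of (time step, node i, node j)
    duration:    length of the original recording, in time steps
    repetitions: total number of copies in the extended sequence
    """
    return [
        (t + k * duration, i, j)
        for k in range(repetitions)
        for (t, i, j) in contacts
    ]

# Hypothetical toy recording lasting 3 time steps.
contacts = [(0, "a", "b"), (2, "b", "c")]
extended = repeat_temporal_network(contacts, duration=3, repetitions=2)
print(extended)
```

Every copy contains exactly the same contacts in the same order, only shifted in time, so the extended sequence adds duration but no new variability in who meets whom and when.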
The main byproduct is an algorithm for surrogate network generation, data augmentation and sharing that has been made available to all researchers via my GitHub page (https://github.com/giuliacencetti/Surrogate_net_generation).